Abstract

Let $U_{k,N}$ denote the Boolean function which takes as input $k$ strings of $N$ bits each, representing
$k$ numbers $a^{(1)}, \ldots, a^{(k)}$ in $\{0, 1, \ldots, 2^N - 1\}$, and outputs 1 if and only if
$a^{(1)} + \cdots + a^{(k)} \ge 2^N$. Let $\mathrm{THR}_{t,n}$ denote a monotone unweighted threshold gate, i.e.,
the Boolean function which takes as input a single string $x \in \{0,1\}^n$ and outputs 1 if and only if
$x_1 + \cdots + x_n \ge t$. The function $U_{k,N}$ may be viewed as a monotone function that performs addition,
and $\mathrm{THR}_{t,n}$ may be viewed as a monotone function that performs counting. We refer to circuits that
are composed of THR gates as monotone majority circuits.

The main result of this paper is an exponential lower bound on the size of bounded-depth
monotone majority circuits that compute $U_{k,N}$. More precisely, we show that for any constant
$d \ge 2$, any depth-$d$ monotone majority circuit computing $U_{d,N}$ must have size $2^{\Omega(N^{1/d})}$.
Since $U_{k,N}$ can be computed by a single monotone weighted threshold gate (that uses exponentially
large weights), our lower bound implies that constant-depth monotone majority circuits require
exponential size to simulate monotone weighted threshold gates. This answers a question posed
by Goldmann and Karpinski (STOC’93) and recently restated by Håstad (2010, 2014). We also
show that our lower bound is essentially best possible, by constructing a depth-$d$, size-$2^{O(N^{1/d})}$
monotone majority circuit for $U_{d,N}$.

As a corollary of our lower bound, we significantly strengthen a classical theorem in circuit
complexity due to Ajtai and Gurevich (JACM’87). They exhibited a monotone function that is
in $\mathrm{AC}^0$ but requires super-polynomial size for any constant-depth monotone circuit composed of
unbounded fan-in AND and OR gates. We describe a monotone function that is in depth-3 $\mathrm{AC}^0$
but requires exponential size monotone circuits of any constant depth, even if the circuits are
composed of THR gates.
1 Introduction.


“And you do Addition?” the White Queen asked. “What’s one and one and one and 
one and one and one and one and one and one and one?” 

“I don’t know,” said Alice. “I lost count.”

“She can’t do Addition,” the Red Queen interrupted. 

— Lewis Carroll, Through the Looking Glass 

Threshold functions and threshold circuits. A Boolean function $f : \{0,1\}^n \to \{0,1\}$ is called
a weighted threshold function (also known as a halfspace, weighted majority, weighted threshold
gate, or linear threshold function) if there exist integers $w_1, \ldots, w_n$ and $t$ such that

$$f(x) = 1 \iff \sum_{i=1}^{n} w_i x_i \ge t.$$

The parameters $w_1, \ldots, w_n$ are called weights. We say that a threshold function $f$ is unweighted if
$|w_i| = 1$ for every $i \in \{1, \ldots, n\}$, and that it is monotone if every weight is non-negative. (Thus a
monotone unweighted threshold function is precisely a $\mathrm{THR}_{t,n}$ function described in the abstract.)
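
To make the definition concrete, here is a minimal Python sketch of a weighted threshold function and of its monotone unweighted special case. The function names `weighted_threshold`, `thr` and `is_monotone` are illustrative choices and do not appear in the paper.

```python
from typing import Sequence


def weighted_threshold(weights: Sequence[int], t: int, x: Sequence[int]) -> int:
    """Return 1 iff sum_i weights[i] * x[i] >= t, and 0 otherwise."""
    if len(weights) != len(x):
        raise ValueError("weights and x must have the same length")
    return int(sum(w * b for w, b in zip(weights, x)) >= t)


def thr(t: int, x: Sequence[int]) -> int:
    """Monotone unweighted gate THR_{t,n}: 1 iff at least t of the n bits are 1."""
    return weighted_threshold([1] * len(x), t, x)


def is_monotone(weights: Sequence[int]) -> bool:
    """A threshold function is called monotone here if every weight is non-negative."""
    return all(w >= 0 for w in weights)
```

For instance, `thr(2, [1, 0, 1])` counts two ones and accepts, while the weighted gate with weights `[4, 2, 1]` and threshold 5 accepts exactly the 3-bit numbers that are at least 5.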

Threshold functions and their generalizations have been extensively investigated for decades 
(see e.g. Dertouzos [Der65], Minsky and Papert [MP68], and Muroga [Mur71]), and arise in diverse 
areas including social choice theory (Taylor and Zwicker [TZ92]), circuit complexity (Aspnes et al. 
[ABFR94]), structural complexity (Beigel, Reingold, and Spielman [BRS95]), learning theory (Freund
and Schapire [FS97]), neural networks (Parberry [Par94]), cryptography (Naor and Reingold
[NR04]), and many others. 

In this work, we consider Boolean circuits that are composed of gates that compute threshold
functions (i.e., threshold gates). (We refer to Jukna [Juk12] as an extensive reference on Boolean
functions and circuit complexity.) While individual threshold gates may appear relatively simple,
Boolean circuits composed of these gates (i.e., threshold circuits) remain poorly understood despite
intensive study. For instance, it is a notorious and long-standing open problem in complexity
theory to prove the existence of a function in NP that cannot be computed by a depth-2 circuit
with polynomially many weighted threshold gates. This difficulty can be explained in part by the
surprising computational power of bounded-depth threshold circuits, both in theory and practice.
On the theory side, such circuits can efficiently implement all the basic arithmetic operations (see
e.g., Table 1 in Sherstov [She07]) and can also simulate (in quasi-polynomial size and depth 3)
AND/OR/MOD$_m$ Boolean circuits of much larger depth (Allender [All89] and Yao [Yao90]). On a
more practical level, constant-depth networks of (continuous analogues of) threshold gates play a
fundamental role in recent successful deep learning frameworks (see e.g., Schmidhuber [Sch15]).

Despite our inability to prove strong lower bounds against threshold circuits, there have been
some notable successes in understanding the relative power of weighted versus unweighted threshold
gates and circuits. Siu and Bruck [SB91] were the first to show that any weighted threshold gate
can be simulated by a polynomial-size, constant-depth circuit consisting of unweighted threshold
gates (such circuits are also known as majority circuits). This result was improved by Goldmann,
Håstad, and Razborov in [GHR92], who showed (non-constructively) that weighted threshold gates
can be computed by polynomial-size majority circuits of depth 2; in fact, [GHR92] showed that any
depth-$d$ weighted threshold circuit can be simulated efficiently by a depth-$(d + 1)$ majority circuit.
Soon thereafter Goldmann and Karpinski [GK93] gave a constructive proof with better parameters
for the size of the resulting majority circuits. Subsequent simplifications and improvements of these
simulations were given by Hofmeister [Hof96] and Amano and Maruoka [AM05].

Monotone functions and monotone circuits. In a different, and highly successful, strand of
circuit complexity research, a wide range of lower bounds have been obtained against various types
of monotone Boolean circuits (composed of AND/OR gates only but no negations). A sequence of
well-known results [Raz85, And85, AB87, Tar88] culminated in the existence of explicit monotone
Boolean functions that can be computed by polynomial-size Boolean circuits but require monotone
circuits of exponential size. Analogous results highlighting the limitations of monotone circuits
are also known at the “low-complexity” end of the spectrum: in an important result, Ajtai and
Gurevich [AG87] exhibited a monotone function in $\mathrm{AC}^0$ (i.e., computed by a constant-depth,
polynomial-size AND/OR/NOT Boolean circuit) that requires monotone $\mathrm{AC}^0$ circuits (composed of
AND/OR gates) to have super-polynomial size. However, it should be noted that the Ajtai-Gurevich
circuit lower bound against monotone $\mathrm{AC}^0$ is quantitatively not very strong (at best a quasipolynomial
lower bound; see discussion following the statement of the Ajtai-Gurevich theorem below). Other
works have given alternative/simplified expositions of the Ajtai-Gurevich lower bound and of its
consequences in formal logic (see [BST13] for the former and Stolboushkin [Sto95] for the latter).
But prior to the results of this paper, stronger lower bounds against monotone $\mathrm{AC}^0$ circuits for
monotone functions in $\mathrm{AC}^0$ remained elusive.

This work: Monotone weighted threshold functions versus constant-depth monotone 
majority circuits. As mentioned earlier, Goldmann and Karpinski gave a constructive proof 
[GK93] that weighted threshold gates can be simulated by polynomial-size and depth-2 majority 
circuits. They also observed that even if the weighted threshold gate is monotone, known simulations
produce majority circuits that are inherently non-monotone (i.e., they contain majority gates
with negative weights, or equivalently, negation gates), which then led them to ask the question of 
whether an efficient monotone simulation is possible in constant depth. 

Hofmeister [Hof92] made some early progress on this question by showing that any monotone
depth-2 majority circuit that computes the function $U_{2,N}$ from the abstract must have exponential
size. To state the result more precisely, let us first clearly specify our notion of monotone majority
circuits. A monotone majority circuit here is a directed acyclic graph which may have multiple
edges (called wires). There is a single node with no outgoing wires, called the output gate. Nodes
that have no incoming wires are called input nodes and are each labeled either 0, 1 or $x_j$ for some
index $j$; every other node is labeled with a monotone unweighted threshold gate $\mathrm{THR}_{t,m}$ for some $t$,
with $m$ being its in-degree, which outputs 1 iff at least $t$ of its $m$ input wires carry a 1. We say
the size of a monotone unweighted threshold gate $\mathrm{THR}_{t,m}$ is $m$ (its in-degree), and that the size
of a monotone majority circuit is the sum of the sizes of its gates (equivalently, its number of wires).¹ Then
Hofmeister showed that every depth-2 monotone majority circuit for $U_{2,N}$ must have size

As mentioned above, in subsequent work [Hof96] and [AM05], several improvements were made
on the Goldmann-Karpinski simulation, but neither is monotone, and no further progress was obtained
on the lower bound side after Hofmeister’s paper [Hof92] until the current work. The question
of Goldmann and Karpinski was recently restated by Håstad [Has10, BHKS14].

¹Observe that by reduplicating inputs, any weighted threshold function $f$ given by $\sum_i w_i x_i \ge t$ can be computed
by an unweighted threshold gate of size $|w_1| + \cdots + |w_n|$. We sometimes refer to this as the “weight of $f$.”
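
The circuit model described above, together with the reduplication trick from footnote 1, can be sketched in a few lines of Python. The tuple encoding of wires and gates and the helper names `evaluate`, `size` and `reduplicate` are assumptions made purely for illustration. Gates are assumed to be listed in topological order with the output gate last, and `reduplicate` handles only the monotone case of non-negative weights.

```python
from typing import List, Sequence, Tuple

# A wire is ("x", j) for input bit j, ("c", b) for the constant b, or ("g", i)
# for the output of an earlier gate i. Repeated wires are allowed.
Wire = Tuple[str, int]
# A gate is (threshold t, list of incoming wires); its size is its in-degree.
Gate = Tuple[int, List[Wire]]


def evaluate(circuit: List[Gate], x: Sequence[int]) -> int:
    """Evaluate a monotone majority circuit; the last gate is the output gate."""
    values: List[int] = []
    for t, wires in circuit:
        ones = 0
        for kind, idx in wires:
            if kind == "x":
                ones += x[idx]
            elif kind == "c":
                ones += idx
            elif kind == "g":
                ones += values[idx]  # must refer to an earlier gate
            else:
                raise ValueError(f"unknown wire kind: {kind}")
        values.append(int(ones >= t))
    return values[-1]


def size(circuit: List[Gate]) -> int:
    """Circuit size = sum of gate in-degrees = number of wires."""
    return sum(len(wires) for _, wires in circuit)


def reduplicate(weights: Sequence[int], t: int) -> List[Gate]:
    """Turn a monotone weighted gate into one unweighted gate by repeating inputs."""
    if any(w < 0 for w in weights):
        raise ValueError("this sketch only covers non-negative weights")
    wires = [("x", j) for j, w in enumerate(weights) for _ in range(w)]
    return [(t, wires)]
```

With weights `[4, 2, 1]` and threshold 5, `reduplicate` produces a single gate with 7 wires, matching the "weight" $|w_1| + \cdots + |w_n|$ of the function.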

1.1 Our Results. 

Our main result shows that monotone weighted threshold gates cannot be simulated by subexponential
size monotone majority circuits of constant depth. This may be viewed as an extension of
Hofmeister’s depth-2 lower bound in [Hof92] to arbitrary constant depth (in fact we obtain
super-polynomial size lower bounds even for circuits of small super-constant depth; see discussions after
Theorem 1 below). We thus answer the question posed by Goldmann and Karpinski [GK93] and by
Håstad [Has10, BHKS14].

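Before giving a precise statement of our results, we define formally the family of Boolean
functions $U_{k,N}$ as described in the abstract. Given $t \ge 1$, we let $[t]$ denote the set $\{1, \ldots, t\}$. For $k \ge 2$,
the function $U_{k,N}$ maps $\{0,1\}^{k \times N}$ to $\{0,1\}$ as follows. Given $x = (x_{i,j})_{i \in [k], j \in [N]} \in \{0,1\}^{k \times N}$, define

$$\mathrm{SUM}(x) = \sum_{j=1}^{N} 2^{N-j} \cdot (x_{1,j} + \cdots + x_{k,j}) \quad\text{and}\quad U_{k,N}(x) = \begin{cases} 1 & \text{if } \mathrm{SUM}(x) \ge 2^N, \\ 0 & \text{otherwise.} \end{cases}$$

It is helpful to think of the input $x = (x_{i,j})_{i \in [k], j \in [N]}$ as a $k$-row, $N$-column, and 0/1-valued matrix,
where its $i$th row $(x_{i,1}, \ldots, x_{i,N})$ gives the binary representation of a number $x^{(i)} \in \{0, 1, \ldots, 2^N - 1\}$
in the usual way (with $x_{i,1}$ being the most significant bit). Then the function $U_{k,N}$ adds up the $k$
numbers $x^{(1)}, \ldots, x^{(k)}$ and outputs 1 if and only if the sum is at least $2^N$.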
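
As a sanity check on the definition, the following Python sketch computes SUM and $U_{k,N}$ for a matrix given as a list of rows (most significant bit first), and also evaluates $U_{k,N}$ as a single monotone weighted threshold gate with weight $2^{N-j}$ on every variable in column $j$. The function names are illustrative assumptions.

```python
from typing import List

Matrix = List[List[int]]  # k rows, each a list of N bits, most significant bit first


def SUM(x: Matrix) -> int:
    """Sum of the k numbers whose binary representations are the rows of x."""
    n_cols = len(x[0])
    return sum(2 ** (n_cols - j) * sum(row[j - 1] for row in x)
               for j in range(1, n_cols + 1))


def U(x: Matrix) -> int:
    """U_{k,N}(x) = 1 iff SUM(x) >= 2^N."""
    return int(SUM(x) >= 2 ** len(x[0]))


def U_as_weighted_gate(x: Matrix) -> int:
    """The same function written as one weighted threshold gate with threshold 2^N."""
    n_cols = len(x[0])
    weights = [2 ** (n_cols - j) for _ in x for j in range(1, n_cols + 1)]
    bits = [b for row in x for b in row]
    return int(sum(w * b for w, b in zip(weights, bits)) >= 2 ** n_cols)
```

For example, the rows `[1, 0, 1]` and `[0, 1, 1]` encode 5 and 3, whose sum 8 reaches $2^3$, so both `U` and `U_as_weighted_gate` return 1. The weights of the single gate grow exponentially in $N$, which is what makes a small unweighted simulation nontrivial.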

With the definition of $U_{k,N}$ in place, our main result can be stated as follows:

Theorem 1. Let d> 2, n and N be three positive integers that satisfy 

n > 2®“ and N > {2^^n)^. 

Then any depth-d monotone majority circuit that computes Ud^N must have size at least . 

This lower bound is nearly optimal for any fixed $d \ge 2$, as we prove the following upper bound.

Theorem 2. Let k,d,N > 2 be three positive integers. Then there exists a depth-d monotone 
majority circuit of size that computes Uk,N- 

Remark 1. For any fixed constant d > 2, Theorems 1 and 2 together show that the smallest 
depth-d monotone majority circuit that computes Ud,N (note that this function has d ■ N = Q{N) 
input variables) has size exp(0(iV^/'^)). In addition, by setting d = cy/log N and n = 2®^'^ for 
some small enough positive constant c so that N > (2^^n)'^, Theorem 1 implies that any depth-d 
monotone majority circuit computing Ud,N has super polynomial size (exponential in 2'^’^^°® ). 

Remark 2. As an easy consequence of Theorem 2, we obtain a slightly weaker version of the
main result of Beimel and Weinreb [BW05]. They proved that the “universal monotone threshold
function”² $U_{O(N),\,O(N \log N)}$ can be computed by a poly($N$)-size, depth-$O(\log N)$ monotone circuit
composed of fan-in two AND gates and unbounded fan-in OR gates. While Theorem 2 above is
tailored for small values of $k$, we note that it implies that $U_{O(N),\,O(N \log N)}$ can be computed by a
poly($N$)-size, depth-$O(\log^2 N)$ monotone circuit composed of fan-in two AND/OR gates only. (In
more detail, it is enough to set $d = \log N$ and replace each majority gate by an $O(\log N)$-depth
fan-in-two AND/OR Boolean circuit.) We sketch a simpler construction in Appendix A that matches
the parameters obtained in [BW05] in the case of the universal monotone threshold function.

²It is called the universal monotone threshold function because it can simulate any monotone weighted threshold
function over $N$ inputs.

Another consequence of our lower bound as stated in Theorem 1 is a significant strengthening 
of the Ajtai-Gurevich lower bound discussed earlier. We recall their result in more detail: 

Theorem (Ajtai-Gurevich [AG87]). There exists an explicit sequence $f = \{f_n\}_{n \in \mathbb{N}}$ of monotone
Boolean functions fn'- {0,1}"" ^ {0,1} such that: 

(i) $f \in \mathrm{AC}^0$;

(ii) $f \notin \mathrm{monAC}^0$: For any fixed constant $d$, any monotone depth-$d$ AND/OR circuit computing $f_n$
must have size at least Sd{n), for some function Sd{n) = 

Regarding part (ii) above, it is not immediately clear what is the best (largest) function $S_d(n)$ that
can be extracted from the Ajtai-Gurevich proof. However, $f_n$ is easily seen to be computed by a
monotone depth-2 circuit (a monotone DNF) of size so Sd{n) < for all d> 2. 

As an easy corollary of Theorem 1, we strengthen the Ajtai-Gurevich circuit lower bound
(for a different monotone function in $\mathrm{AC}^0$) in two ways: by giving a lower bound against monotone
majority circuits of constant depth (rather than monotone circuits of AND/OR gates only), and 
by achieving an exponential size lower bound for any fixed depth (rather than a bound which is at 
most n^°®”’). Our theorem is the following: 

Theorem 3. There exists an explicit sequence $g = \{g_n\}_{n \in \mathbb{N}}$ of monotone Boolean functions, where
gn'- {0,1}"’'“S"’ —>■ {0,1}, such that: 

(i) $g \in \mathrm{AC}^0$ (in fact each $g_n$ is computed by a poly($n$)-size, depth-3 AND/OR/NOT circuit);

(ii) For any constant $d \ge 2$, any monotone depth-$d$ majority circuit for $g_n$ must have size

It is interesting to observe that our proof of Theorem 3 uses very different arguments from those 
of Ajtai and Gurevich. The heart of their proof is a “switching lemma” for monotone functions 
on hypergrids (see the excellent exposition of their proof given in [BST13]), whereas our approach 
does not use switching lemmas at all. 

1.2 Related Work and Our Techniques. 

In addition to papers discussed above, the works of Yao [Yao89] and Håstad and Goldmann [HG91]
are relevant in the context of our lower bound result. Let Sipser^^^ denote the read-once monotone 
n-variable formula of depth d-\-l that has alternating layers of AND and OR gates (see [HG91] for 
a detailed description of this function). Strengthening the earlier result of Yao [Yao89], Håstad and
Goldmann [HG91] showed that a depth-$d$ circuit of weighted monotone threshold gates computing
Sipser^^.,.]^ must have size In contrast, our Theorem 1 only establishes a lower bound against 

constant-depth monotone circuits of unweighted threshold gates, but, crucially, we establish
the lower bound for a much “simpler” monotone function, $U_{d,N}$, that is computed by a single
weighted monotone threshold gate. Indeed, the main challenge of our work is to push through a
lower bound for such a heavily constrained target function.
At the heart of our lower bound proof is a sequence of carefully constructed pairs of probability
distributions $(\mathcal{YES}_\ell, \mathcal{NO}_\ell)$ over $\{0,1\}^{(\ell+1) \times N_\ell}$ for $\ell = 1, \ldots, d$ (i.e. over possible inputs to
$U_{\ell+1,N_\ell}$, for some $N_\ell$ to be specified later). The first distribution $\mathcal{YES}_\ell$ in the pair is supported on strings
$x$ that have $U_{\ell+1,N_\ell}(x) = 1$, while $\mathcal{NO}_\ell$ is supported on strings with $U_{\ell+1,N_\ell}(x) = 0$. The key
property of these pairs of distributions, which yields our lower bound, is that considered together,
each pair $(\mathcal{YES}_\ell, \mathcal{NO}_\ell)$ is “hard” for “small” monotone majority circuits of depth $\ell$ in a suitable
sense. In a bit more detail, our requirement is roughly that for any such circuit $F$, we have

$$\Pr_{x \sim \mathcal{YES}_\ell}[F(x) = 1] + \Pr_{y \sim \mathcal{NO}_\ell}[F(y) = 0] \le 1 + \eta, \qquad (1)$$

for a suitable value $0 < \eta \ll 1$. At a high level, we establish (1) above through a careful inductive
argument on $\ell$. (We note that the preceding sketch is something of an oversimplification; actually,
in order for the inductive hypothesis to be “strong enough to prove itself,” we require an analogue
of (1) both for the pair $(\mathcal{YES}_\ell, \mathcal{NO}_\ell)$ and for another pair of distributions $(\mathcal{YES}'_\ell, \mathcal{NO}'_\ell)$, and the
inductive argument establishing the case $\ell = j + 1$ from the case $\ell = j$ requires careful analysis of
yet a third carefully constructed pair $(\mathcal{YES}^*_\ell, \mathcal{NO}^*_\ell)$ of distributions. See Section 2 for full details
of the argument.)³

Notation and Organization. Recall that a restriction $\rho$ of a function $f$ is an assignment
fixing some of the input variables of $f$. We write “$f \upharpoonright \rho$” to denote $f$ restricted by $\rho$, a function over
the remaining variables. We use boldface lower-case letters x, y, etc. to denote string-valued random
variables and boldface capital letters X, Y, etc. to denote real-valued random variables.

The rest of the paper is organized as follows. We prove Theorems 1 and 2 in Sections 2 and 3, 
respectively. We then use Theorem 1 to prove Theorem 3 in Section 4. 

2 The Lower Bound: Proof of Theorem 1. 

We prove Theorem 1 in this section. Throughout the section we use d, n and N to denote the three 
positive integers in the statement of Theorem 1 with n > 2®^'^ and N > (2^^n)'^. 

This section is organized as follows. In Sections 2.1 and 2.2, we define inductively two pairs
$(\mathcal{YES}_\ell, \mathcal{NO}_\ell)$ and $(\mathcal{YES}'_\ell, \mathcal{NO}'_\ell)$ of distributions over strings $x \in \{0,1\}^{(\ell+1) \times N_\ell}$, for $\ell$ from 1 to $d$,
where $N_\ell$ is specified later and satisfies $N_1 < \cdots < N_d \le N$. An important property of these
distributions is that every $x$ drawn from $\mathcal{YES}_\ell$, $\mathcal{NO}_\ell$, $\mathcal{YES}'_\ell$, $\mathcal{NO}'_\ell$ has SUM($x$) equal to

$$2^{N_\ell}, \quad 2^{N_\ell} - 1, \quad 2^{N_\ell} - 1 \quad\text{and}\quad 2^{N_\ell} - (\ell + 1),$$

respectively. From the definition of $(\mathcal{YES}_1, \mathcal{NO}_1)$ and $(\mathcal{YES}'_1, \mathcal{NO}'_1)$ it is not too difficult to show
that both pairs are very hard for monotone depth-1 majority circuits (Lemma 2.1), i.e. no majority
gate with small weights can output 1 on strings drawn from $\mathcal{YES}_1$ with probability $p_1$ and at the
same time output 0 on strings drawn from $\mathcal{NO}_1$ with probability $p_2$ if $p_1 + p_2$ is slightly larger than
1 (and the same holds for $\mathcal{YES}'_1$ and $\mathcal{NO}'_1$).

³Notice that the argument we just sketched implies that $U_{d+1,N}$ is hard against depth-$d$ circuits. A more careful
analysis at the end of the argument using the distributions $(\mathcal{YES}^*_d, \mathcal{NO}^*_d)$ allows us to obtain the same lower bound
for $U_{d,N}$, as stated in Theorem 1.
Then we prove our main technical lemma (Lemma 2.7) in Section 2.3, which shows by induction
that both pairs $(\mathcal{YES}_\ell, \mathcal{NO}_\ell)$ and $(\mathcal{YES}'_\ell, \mathcal{NO}'_\ell)$ are hard in the same sense for “small” depth-$\ell$
majority circuits over $\{0,1\}^{(\ell+1) \times N_\ell}$, for every $\ell \in [d]$, with $(\mathcal{YES}_1, \mathcal{NO}_1)$ and $(\mathcal{YES}'_1, \mathcal{NO}'_1)$ serving
as the base case. Theorem 1 for $U_{d+1,N}$ (instead of $U_{d,N}$ as stated) follows directly from $N_d \le N$
and the property that strings $x$ drawn from $\mathcal{YES}_d$ and $\mathcal{NO}_d$ have SUM($x$) equal to $2^{N_d}$ and $2^{N_d} - 1$,
respectively. (Note that, although the second pair $(\mathcal{YES}'_d, \mathcal{NO}'_d)$ is not needed in the proof of Theorem 1
once Lemma 2.7 has been established, the intermediate pairs $(\mathcal{YES}'_\ell, \mathcal{NO}'_\ell)$ play a crucial
role in the inductive definition of these distributions and the proof of Lemma 2.7.)

In order to extend the result to $U_{d,N}$ (as stated in Theorem 1), we rely on another auxiliary
pair of distributions $(\mathcal{YES}^*_d, \mathcal{NO}^*_d)$ constructed during the proof, which is described in more detail
in Section 2.2. We finally use Lemma 2.7 to prove Theorem 1 in Section 2.4.


2.1 The Initial Two Pairs of Distributions. 

Let d, n, N be positive integers in the statement of Theorem 1. Let e and Ni'= n ■ (1/e). 

Given a string $z \in \{0,1\}^{2 \times N_1}$, the $j$-th column of $z$ corresponds to a pair of positions $(1, j)$ and
$(2, j)$, where $j \in [N_1]$.

We now define two pairs of distributions $(\mathcal{YES}_1, \mathcal{NO}_1)$ and $(\mathcal{YES}'_1, \mathcal{NO}'_1)$ over $\{0,1\}^{2 \times N_1}$, and
show that they are hard for monotone depth-1 majority circuits of not-too-large size. We define
the distributions via the following sampling processes.


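• A string $x \sim \mathcal{YES}_1$ is generated as follows. Let $R \sim [N_1]$ be uniformly random. We set both
bits in the $R$-th column of $x$ to 1. For every $j > R$, we set both bits in the $j$-th column of $x$
to 0. For every $j < R$, we set the $j$-th column of $x$ to (1,0) or (0,1) independently and with
equal probability. For example, writing an $x \in \mathrm{supp}(\mathcal{YES}_1)$ as a matrix, it would look like

$$x = \begin{pmatrix} 1 & 0 & 0 & 1 & \cdots & 0 & 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 1 & 0 & \cdots & 1 & 1 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix}$$

and we have SUM($x$) $= 2^{N_1}$.

• A string $y \sim \mathcal{NO}_1$ is generated by setting its $j$-th column to (1,0) or (0,1) independently
and with equal probability for each $j \in [N_1]$. So a string $y \in \mathrm{supp}(\mathcal{NO}_1)$ would look like

$$y = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots & 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 & \cdots & 0 & 0 & 1 & 0 \end{pmatrix}$$

and we have SUM($y$) $= 2^{N_1} - 1$.

• $\mathcal{YES}'_1$ is the same as $\mathcal{NO}_1$. In particular, each $x \in \mathrm{supp}(\mathcal{YES}'_1)$ has SUM($x$) $= 2^{N_1} - 1$.

• Finally, a string $y \sim \mathcal{NO}'_1$ is obtained as follows. First, sample a random $x \sim \mathcal{YES}_1$. Then
let $y$ be the string obtained by negating each bit of $x$. So a string $y \in \mathrm{supp}(\mathcal{NO}'_1)$ looks like

$$y = \begin{pmatrix} 1 & 1 & 0 & 1 & \cdots & 0 & 0 & 1 & \cdots & 1 \\ 0 & 0 & 1 & 0 & \cdots & 1 & 0 & 1 & \cdots & 1 \end{pmatrix}$$

and SUM($y$) $= 2^{N_1} - 2$.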
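
The four sampling processes above are simple enough to check numerically. The sketch below is a hedged illustration only: the helper names (`sample_yes1`, `sample_no1`, and so on) are hypothetical, columns are 0-indexed in the code, and the width `n1` is treated as a free parameter rather than the specific value $N_1$ fixed in the paper.

```python
import random
from typing import List

Matrix = List[List[int]]  # two rows, each with n1 bits, most significant bit first


def SUM(x: Matrix) -> int:
    n_cols = len(x[0])
    return sum(2 ** (n_cols - 1 - j) * sum(row[j] for row in x) for j in range(n_cols))


def sample_no1(n1: int, rng: random.Random) -> Matrix:
    """NO_1: every column is (1,0) or (0,1), independently and uniformly."""
    top = [rng.randint(0, 1) for _ in range(n1)]
    return [top, [1 - b for b in top]]


def sample_yes1(n1: int, rng: random.Random) -> Matrix:
    """YES_1: random columns before column R, (1,1) at column R, (0,0) after it."""
    r = rng.randrange(n1)  # 0-based position of the column called R in the text
    top, bottom = [], []
    for j in range(n1):
        if j < r:
            b = rng.randint(0, 1)
            top.append(b)
            bottom.append(1 - b)
        elif j == r:
            top.append(1)
            bottom.append(1)
        else:
            top.append(0)
            bottom.append(0)
    return [top, bottom]


def sample_yes1_prime(n1: int, rng: random.Random) -> Matrix:
    """YES'_1 is the same distribution as NO_1."""
    return sample_no1(n1, rng)


def sample_no1_prime(n1: int, rng: random.Random) -> Matrix:
    """NO'_1: negate every bit of a sample from YES_1."""
    return [[1 - b for b in row] for row in sample_yes1(n1, rng)]
```

Every sample from these four functions has SUM equal to $2^{n_1}$, $2^{n_1} - 1$, $2^{n_1} - 1$ and $2^{n_1} - 2$ respectively, regardless of the random choices, which is the property the construction relies on.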


Recall a monotone depth-1 majority circuit of size $s$ is simply a monotone weighted majority
gate with total weight at most $s$. We show below that both pairs $(\mathcal{YES}_1, \mathcal{NO}_1)$ and $(\mathcal{YES}'_1, \mathcal{NO}'_1)$
defined above are hard for a monotone depth-1 circuit (to be correct on both $\mathcal{YES}_1$ and $\mathcal{NO}_1$, or
on both $\mathcal{YES}'_1$ and $\mathcal{NO}'_1$, with nontrivial probability) unless the total weight $s$ is large.
Lemma 2.1. For any depth-1 monotone majority circuit $F$ over $\{0,1\}^{2 \times N_1}$ of size at most $2^{n-1}$,

$$\Pr_{x \sim \mathcal{YES}_1}[F(x) = 1] + \Pr_{y \sim \mathcal{NO}_1}[F(y) = 0] \le 1 + \varepsilon, \quad\text{and} \qquad (2)$$

$$\Pr_{x \sim \mathcal{YES}'_1}[F(x) = 1] + \Pr_{y \sim \mathcal{NO}'_1}[F(y) = 0] \le 1 + \varepsilon. \qquad (3)$$

Proof. We present the proof of the first inequality on $(\mathcal{YES}_1, \mathcal{NO}_1)$. An entirely similar argument
establishes the bound for $(\mathcal{YES}'_1, \mathcal{NO}'_1)$.

Consider an auxiliary distribution $\mathcal{D}$ (essentially a coupling of $\mathcal{YES}_1$ and $\mathcal{NO}_1$) supported over
$\{0,1\}^{2 \times N_1} \times \{0,1\}^{2 \times N_1} \times [N_1]$, and defined in the following way. A draw $(x, y, R) \sim \mathcal{D}$ is obtained by
selecting a uniformly random $R \sim [N_1]$, a string $y \sim \mathcal{NO}_1$, and by letting $x = x(y, R) \in \{0,1\}^{2 \times N_1}$
be the string obtained by replacing the $R$-th column of $y$ with (1,1), and by setting the $j$-th column
of $y$ to (0,0) for every $j > R$. Observe that the marginal distributions $\mathcal{D}_x$ and $\mathcal{D}_y$ are identical to
$\mathcal{YES}_1$ and $\mathcal{NO}_1$, respectively. Consequently,

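$$\begin{aligned}
\text{LHS of Equation (2)} &= \Pr_{(x,y,R) \sim \mathcal{D}}[F(x) = 1] + \Pr_{(x,y,R) \sim \mathcal{D}}[F(y) = 0] \\
&= \Pr[F(x) = 1 \text{ or } F(y) = 0] + \Pr[F(x) = 1 \text{ and } F(y) = 0] \\
&\le 1 + \Pr[F(x) = 1 \text{ and } F(y) = 0].
\end{aligned}$$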
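
The coupling and the counting identity $\Pr[A] + \Pr[B] = \Pr[A \text{ or } B] + \Pr[A \text{ and } B]$ used above can be verified exactly for small parameters by enumerating every pair $(y, R)$. The sketch below is an illustration under assumptions: the names `gate`, `coupled_x` and `coupling_stats` are hypothetical, columns are 0-indexed, and the gate is given by a $2 \times n_1$ table of non-negative weights and a threshold.

```python
from fractions import Fraction
from itertools import product
from typing import List, Tuple

Matrix = List[List[int]]


def gate(weights: Matrix, t: int, x: Matrix) -> int:
    """Monotone weighted threshold gate on a 2 x n1 input."""
    total = sum(weights[i][j] * x[i][j] for i in range(2) for j in range(len(x[0])))
    return int(total >= t)


def coupled_x(y: Matrix, r: int) -> Matrix:
    """x(y, R): keep columns before r, put (1,1) in column r, zeros afterwards."""
    n1 = len(y[0])
    return [[row[j] if j < r else int(j == r) for j in range(n1)] for row in y]


def coupling_stats(weights: Matrix, t: int, n1: int) -> Tuple[Fraction, Fraction, Fraction]:
    """Return (Pr[F(x)=1] + Pr[F(y)=0], Pr[F(x)=1 or F(y)=0], Pr[F(x)=1 and F(y)=0])."""
    lhs = p_or = q = Fraction(0)
    mass = Fraction(1, n1 * 2 ** n1)  # y uniform over supp(NO_1), R uniform
    for top in product((0, 1), repeat=n1):
        y = [list(top), [1 - b for b in top]]
        fy_is_zero = gate(weights, t, y) == 0
        for r in range(n1):
            fx_is_one = gate(weights, t, coupled_x(y, r)) == 1
            lhs += mass * (int(fx_is_one) + int(fy_is_zero))
            p_or += mass * int(fx_is_one or fy_is_zero)
            q += mass * int(fx_is_one and fy_is_zero)
    return lhs, p_or, q
```

With the exponential weights `[[8, 4, 2, 1], [8, 4, 2, 1]]` and threshold 16 the gate computes $U_{2,4}$ and the left-hand side equals 2, whereas with unit weights and threshold 5 it drops to 5/4; in both cases the first value equals the sum of the other two.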

Hence to prove the lemma, it is enough to show that

$$q := \Pr_{(x,y,R) \sim \mathcal{D}}[F(x) = 1 \text{ and } F(y) = 0] \le \varepsilon. \qquad (4)$$

For every $r \in [N_1]$, let $Y_r$ be an indicator random variable defined on $\mathcal{D}$ that is 1 whenever

$$W_r(y) > \sum_{\ell > r} W_\ell(y),$$

where $W_r(y) := w_{1,r} \cdot y_{1,r} + w_{2,r} \cdot y_{2,r}$, and $w_{i,j}$ is the weight corresponding to the input variable of
$F$ at position $(i, j)$. Informally, $Y_r = 1$ if and only if the weight of $y$ with respect to $F$ at the $r$-th
column is strictly larger than the sum of the weights collected from all succeeding columns.

We will employ the following claim to establish Equation (4).

Claim 2.2. For every $j \in [N_1]$, we have

$$q_j := \Pr_{(x,y,R) \sim \mathcal{D}}[F(x) = 1 \text{ and } F(y) = 0 \mid R = j] \le \Pr_{\mathcal{D}}[Y_j = 1].$$

Proof. We consider first the case where $j = 1$. The conditions of $F(x) = 1$ and $R = 1$ imply that
$w_{1,1} + w_{2,1} \ge t$, where $t$ is the threshold of $F$. Furthermore, because $F(y) = 0$ it must be the case
that $\sum_{r \ge 1} W_r(y) < t$. These inequalities give us

$$w_{1,1} + w_{2,1} - W_1(y) > \sum_{r > 1} W_r(y). \qquad (5)$$

Let $\bar{y}$ be the string obtained from $y$ by flipping the two bits in the first column of $y$. Since the
first column of $y$ contains exactly one 1, we have $W_1(\bar{y}) = w_{1,1} + w_{2,1} - W_1(y)$, while $\bar{y}$ and $y$ agree
on all other columns. Equation (5) is then equivalent to $W_1(\bar{y}) > \sum_{r > 1} W_r(\bar{y})$. Therefore,

$$q_1 \le \Pr\Big[W_1(\bar{y}) > \sum_{r > 1} W_r(\bar{y}) \,\Big|\, R = 1\Big] = \Pr\Big[W_1(y) > \sum_{r > 1} W_r(y) \,\Big|\, R = 1\Big] = \Pr[Y_1 = 1],$$

where the last two equations use the independence of $y$ and $R$ as well as the fact that $y$ and $\bar{y}$ are
identically distributed.

For $j > 1$ the result can be proved similarly by writing $q_j$ as a conditional expectation over the
outcome of the first $j - 1$ columns of $y$, then adapting the argument above in the natural way. □


Claim 2.2 and the definitions of probabilities $q$ and $q_j$ imply that

$$N_1 \cdot q = \sum_{j=1}^{N_1} q_j \le \sum_{j=1}^{N_1} \Pr_{\mathcal{D}}[Y_j = 1] = \mathbf{E}_{\mathcal{D}}\Big[\sum_{j=1}^{N_1} Y_j\Big].$$

In particular, there is a string $y^* \in \mathrm{supp}(\mathcal{NO}_1)$ and a set $S \subseteq [N_1]$ with $|S| \ge N_1 \cdot q$ such that

$$W_r(y^*) > \sum_{\ell > r} W_\ell(y^*) \qquad (6)$$

for each $r \in S$. Recall that the weight associated to each variable in $F$ is a non-negative integer,
and that the total weight of $F$ is at least $\sum_r W_r(y^*)$. It follows directly from (6) that $F$ must have
total weight at least $2^{|S|-1}$. However, by assumption $F$ has total weight at most $2^{n-1}$. Altogether,
we get from these inequalities and $N_1 = n \cdot (1/\varepsilon)$ that $q \le \varepsilon$, which completes the proof. □
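
The last step of the proof rests on the fact that non-negative integer weights in which each selected column strictly dominates everything after it must grow geometrically. The following sketch, with the hypothetical helper names `minimal_dominating_weights` and `dominates`, builds the smallest such sequence for $m$ columns and shows that its total is $2^m - 1$, which is at least $2^{m-1}$.

```python
from typing import List


def minimal_dominating_weights(m: int) -> List[int]:
    """Smallest non-negative integers W_1..W_m with W_r > W_{r+1} + ... + W_m for all r."""
    weights: List[int] = []
    suffix_sum = 0
    for _ in range(m):  # build from the last column backwards
        w = suffix_sum + 1  # smallest integer strictly larger than the suffix sum
        weights.append(w)
        suffix_sum += w
    weights.reverse()
    return weights


def dominates(weights: List[int]) -> bool:
    """Check that every weight strictly exceeds the sum of the weights after it."""
    return all(weights[r] > sum(weights[r + 1:]) for r in range(len(weights)))
```

For $m = 4$ the minimal sequence is `[8, 4, 2, 1]` with total 15. Any other sequence satisfying the domination condition is entrywise at least as large, so a gate of total weight at most $2^{n-1}$ can have at most $n$ dominating columns.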


2.2 A Sequence of Pairs of Pairs of Distributions. 

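Next, suppose that we have defined pairs of distributions $(\mathcal{YES}_{\ell-1}, \mathcal{NO}_{\ell-1})$ and $(\mathcal{YES}'_{\ell-1}, \mathcal{NO}'_{\ell-1})$
over $\{0,1\}^{\ell \times N_{\ell-1}}$, for some $2 \le \ell \le d$, where a string $x$ drawn from $\mathcal{YES}_{\ell-1}$, $\mathcal{NO}_{\ell-1}$, $\mathcal{YES}'_{\ell-1}$ and
$\mathcal{NO}'_{\ell-1}$ has SUM($x$) equal to

$$2^{N_{\ell-1}}, \quad 2^{N_{\ell-1}} - 1, \quad 2^{N_{\ell-1}} - 1 \quad\text{and}\quad 2^{N_{\ell-1}} - ((\ell - 1) + 1), \qquad (7)$$

respectively. (Note that the pairs $(\mathcal{YES}_1, \mathcal{NO}_1)$ and $(\mathcal{YES}'_1, \mathcal{NO}'_1)$ have this property.) Our aim is
to inductively define $(\mathcal{YES}_\ell, \mathcal{NO}_\ell)$ and $(\mathcal{YES}'_\ell, \mathcal{NO}'_\ell)$ over $\{0,1\}^{(\ell+1) \times N_\ell}$, where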

Ne n • Ne-i + 1 < 2^ • • Ni = {2nY • 2^^'^ < {2^^nY < N, fov £ £ {2,... ,d}, 

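and a string $x$ drawn from $\mathcal{YES}_\ell$, $\mathcal{NO}_\ell$, $\mathcal{YES}'_\ell$ and $\mathcal{NO}'_\ell$ has SUM($x$) equal to

$$2^{N_\ell}, \quad 2^{N_\ell} - 1, \quad 2^{N_\ell} - 1 \quad\text{and}\quad 2^{N_\ell} - (\ell + 1), \qquad (8)$$

respectively. To this end we start by defining a pair of distributions $(\mathcal{YES}^*_\ell, \mathcal{NO}^*_\ell)$ over $\{0,1\}^{\ell \times N^*_\ell}$
(note that the number of rows for these distributions, $\ell$, is exactly the same as for the distributions
$\mathcal{YES}_{\ell-1}, \mathcal{NO}_{\ell-1}$ and $\mathcal{YES}'_{\ell-1}, \mathcal{NO}'_{\ell-1}$), with

$$N^*_\ell := n \cdot N_{\ell-1} = N_\ell - 1.$$

Figure 1: Illustrations of how the $\mathcal{YES}^*_\ell$ and $\mathcal{NO}^*_\ell$ distributions
are defined from the $\mathcal{YES}_{\ell-1}$, $\mathcal{NO}_{\ell-1}$, $\mathcal{YES}'_{\ell-1}$ and $\mathcal{NO}'_{\ell-1}$ distributions.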

To define $(\mathcal{YES}^*_\ell, \mathcal{NO}^*_\ell)$, we partition the $N^*_\ell$ columns into $n$ sections, each with $N_{\ell-1}$ columns
(and $\ell$ rows). (So the first section consists of all $x_{i,j}$ with $j \in [N_{\ell-1}]$, the second section consists
of all $x_{i,j}$ with $j \in [N_{\ell-1} + 1, 2N_{\ell-1}]$, and so forth.) A draw of a string from $\mathcal{YES}^*_\ell$ is obtained as
follows: first we draw an integer $T$ uniformly from $[n]$, and then

(a) For each $i < T$, we independently set the $i$-th section to be a string drawn from $\mathcal{NO}_{\ell-1}$
with probability 1/2 or a string drawn from $\mathcal{YES}'_{\ell-1}$ with probability 1/2.

(b) For each $i > T$, we set the $i$-th section to be all 0.

(c) For the $T$-th section, we set it to be a string drawn from $\mathcal{YES}_{\ell-1}$.

See Figure 1 for an illustration. A draw of a string from $\mathcal{NO}^*_\ell$ is obtained in a similar fashion. First
we draw $T$ from $[n]$ uniformly at random, and then

(a') For each $i < T$, we independently set the $i$-th section to be a string drawn from $\mathcal{NO}_{\ell-1}$
with probability 1/2 or a string drawn from $\mathcal{YES}'_{\ell-1}$ with probability 1/2. (Note that this is
the same as step (a) above in the definition of $\mathcal{YES}^*_\ell$.)

(b') For each $i > T$, we set the $i$-th section to be all 1 (this is different from (b) above).

(c') For the $T$-th section, we set it to be a string drawn from $\mathcal{NO}'_{\ell-1}$ (this is different from (c)).
Figure 2: Illustrations of how the $\mathcal{YES}'_\ell$, $\mathcal{NO}'_\ell$, $\mathcal{YES}_\ell$ and $\mathcal{NO}_\ell$ distributions
are defined from the $\mathcal{YES}^*_\ell$ and $\mathcal{NO}^*_\ell$ distributions.


Again see Figure 1 for an illustration. Given (7), we see that a string $x$ drawn from $\mathcal{YES}^*_\ell$ (or from
$\mathcal{NO}^*_\ell$) has SUM($x$) equal to $2^{N^*_\ell}$ (respectively, equal to $2^{N^*_\ell} - \ell$).

With the definitions of $\mathcal{YES}^*_\ell$ and $\mathcal{NO}^*_\ell$ in hand, we now use them to define $(\mathcal{YES}_\ell, \mathcal{NO}_\ell)$ and
$(\mathcal{YES}'_\ell, \mathcal{NO}'_\ell)$ so that every string $x$ drawn from these distributions has SUM($x$) equal to
the values given in (8). Recall that $N_\ell = N^*_\ell + 1$.

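A string $x = (x_{i,j}) \in \{0,1\}^{(\ell+1) \times N_\ell}$ drawn from $\mathcal{YES}'_\ell$ is obtained as follows. First we draw a
string $z$ from $\mathcal{YES}^*_\ell$ and put it in columns $\{2, \ldots, N_\ell\}$ and rows $\{1, \ldots, \ell\}$ of $x$, i.e., $x_{i,j} = z_{i,j-1}$
for all $i \in [\ell]$ and $j \in \{2, \ldots, N_\ell\}$. For the remaining positions (in the first column and the last
row), we set $x_{i,1} = 0$ for all $i \in [\ell + 1]$ and $x_{\ell+1,j} = 1$ for all $j > 1$. The other distribution $\mathcal{NO}'_\ell$ is
defined similarly, except that we draw the string $z$ from $\mathcal{NO}^*_\ell$ instead of from $\mathcal{YES}^*_\ell$. The definition
of $\mathcal{YES}'_\ell$ and $\mathcal{NO}'_\ell$ is illustrated in Figure 2.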

For the other pair $(\mathcal{YES}_\ell, \mathcal{NO}_\ell)$, a string $x$ drawn from $\mathcal{YES}_\ell$ is obtained as follows. As before,
we first draw a string $z$ from $\mathcal{YES}^*_\ell$ and put it in columns $\{2, \ldots, N_\ell\}$ and rows $\{1, \ldots, \ell\}$ of $x$.
Then we set $x_{\ell+1,1} = 1$ and all other variables in the first column and last row of $x$ to be 0. For
the other distribution $\mathcal{NO}_\ell$, we similarly draw $z$ from $\mathcal{NO}^*_\ell$ and put it in columns $\{2, \ldots, N_\ell\}$ and
rows $\{1, \ldots, \ell\}$ of $x$. We set $x_{\ell+1,1} = 1$ and all other variables in the first column to be 0. We
set the last row, i.e., $x_{\ell+1,j}$ with $j \in \{2, \ldots, N_\ell\}$, to be the binary representation of $\ell - 1$. (This is
well defined since $N_\ell > n$, and by the assumption on $n$ in Theorem 1 this is far larger than
$\log d \ge \log \ell$.) As before, see Figure 2 for an illustration of the definition of $\mathcal{YES}_\ell$ and $\mathcal{NO}_\ell$.

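We record the following two useful facts about $N_\ell$ and the distributions:

Fact 2.3. $N_d \le N$.

Fact 2.4. For each $\ell \in [d]$, a string $x$ drawn from $\mathcal{YES}_\ell$, $\mathcal{NO}_\ell$, $\mathcal{YES}'_\ell$, $\mathcal{NO}'_\ell$ has SUM($x$) equal to

$$2^{N_\ell}, \quad 2^{N_\ell} - 1, \quad 2^{N_\ell} - 1 \quad\text{and}\quad 2^{N_\ell} - (\ell + 1),$$

respectively.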
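
Fact 2.4 can be sanity-checked by implementing the recursive construction of Sections 2.1 and 2.2 directly. The sketch below is a hedged illustration, not the paper's code: the function names, the string labels `"YES"`, `"NO"`, `"YES'"`, `"NO'"`, and the treatment of `n` (number of sections) and `n1` (width at level 1) as small free parameters are all assumptions made so that the sums can be checked quickly.

```python
import random
from typing import List

Matrix = List[List[int]]


def SUM(x: Matrix) -> int:
    n_cols = len(x[0])
    return sum(2 ** (n_cols - 1 - j) * sum(row[j] for row in x) for j in range(n_cols))


def width(level: int, n: int, n1: int) -> int:
    """N_level, using N_l = n * N_{l-1} + 1 and N_1 = n1."""
    return n1 if level == 1 else n * width(level - 1, n, n1) + 1


def level1(kind: str, n1: int, rng: random.Random) -> Matrix:
    top = [rng.randint(0, 1) for _ in range(n1)]
    no1 = [top, [1 - b for b in top]]
    if kind in ("NO", "YES'"):
        return no1
    r = rng.randrange(n1)
    yes1 = [[row[j] if j < r else int(j == r) for j in range(n1)] for row in no1]
    return yes1 if kind == "YES" else [[1 - b for b in row] for row in yes1]


def star(kind: str, level: int, n: int, n1: int, rng: random.Random) -> Matrix:
    """Sample YES*_level (kind "YES") or NO*_level (kind "NO"): level rows, n sections."""
    section_width = width(level - 1, n, n1)
    t = rng.randrange(1, n + 1)
    rows: Matrix = [[] for _ in range(level)]
    for i in range(1, n + 1):
        if i < t:      # steps (a) / (a')
            sec = sample(rng.choice(["NO", "YES'"]), level - 1, n, n1, rng)
        elif i == t:   # steps (c) / (c')
            sec = sample("YES" if kind == "YES" else "NO'", level - 1, n, n1, rng)
        else:          # steps (b) / (b')
            fill = 0 if kind == "YES" else 1
            sec = [[fill] * section_width for _ in range(level)]
        for row, part in zip(rows, sec):
            row.extend(part)
    return rows


def sample(kind: str, level: int, n: int, n1: int, rng: random.Random) -> Matrix:
    """Sample from YES_level, NO_level, YES'_level or NO'_level."""
    if level == 1:
        return level1(kind, n1, rng)
    z = star("YES" if kind.startswith("YES") else "NO", level, n, n1, rng)
    w = len(z[0])
    x = [[0] + row for row in z]  # first column is 0 in rows 1..level
    if kind in ("YES'", "NO'"):
        last = [0] + [1] * w
    elif kind == "YES":
        last = [1] + [0] * w
    else:  # "NO": last row holds the binary representation of level - 1
        last = [1] + [int(b) for b in format(level - 1, "b").zfill(w)]
    return x + [last]
```

With `n = 3` and `n1 = 4` the widths are 4, 13 and 40 at levels 1, 2 and 3, and every sample has exactly the SUM value listed in Fact 2.4, independently of the random choices made along the way.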
An important property of the $(\mathcal{YES}_\ell, \mathcal{NO}_\ell)$ pair and of the $(\mathcal{YES}'_\ell, \mathcal{NO}'_\ell)$ pair (which in fact
motivated the above definitions of these distributions in terms of $\mathcal{YES}^*_\ell$ and $\mathcal{NO}^*_\ell$) is that they
are at least as hard to distinguish as $(\mathcal{YES}^*_\ell, \mathcal{NO}^*_\ell)$ for monotone majority circuits.

This is made formal in the following two lemmas.

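Lemma 2.5. Given any monotone majority circuit $F$ over $\{0,1\}^{(\ell+1) \times N_\ell}$, there is a monotone
majority circuit $F^*$ over $\{0,1\}^{\ell \times N^*_\ell}$ of the same size and depth as $F$ such that

$$\Pr_{x \sim \mathcal{YES}'_\ell}[F(x) = 1] + \Pr_{y \sim \mathcal{NO}'_\ell}[F(y) = 0] = \Pr_{x \sim \mathcal{YES}^*_\ell}[F^*(x) = 1] + \Pr_{y \sim \mathcal{NO}^*_\ell}[F^*(y) = 0].$$

Proof. Given $F$, we hard-wire the variables in the first column to be 0 and the rest of the variables
in the last row to be 1. Let $F^*$ denote the new monotone majority circuit obtained from $F$ of the
same size and depth. The definition of $\mathcal{YES}'_\ell$ and $\mathcal{NO}'_\ell$ from $\mathcal{YES}^*_\ell$ and $\mathcal{NO}^*_\ell$ implies that

$$\Pr_{x \sim \mathcal{YES}'_\ell}[F(x) = 1] = \Pr_{x \sim \mathcal{YES}^*_\ell}[F^*(x) = 1] \quad\text{and}\quad \Pr_{y \sim \mathcal{NO}'_\ell}[F(y) = 0] = \Pr_{y \sim \mathcal{NO}^*_\ell}[F^*(y) = 0].$$

The lemma then follows. □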

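Lemma 2.6. Given any monotone majority circuit $F$ over $\{0,1\}^{(\ell+1) \times N_\ell}$, there is a monotone
majority circuit $F^*$ over $\{0,1\}^{\ell \times N^*_\ell}$ of the same size and depth as $F$ such that

$$\Pr_{x \sim \mathcal{YES}_\ell}[F(x) = 1] + \Pr_{y \sim \mathcal{NO}_\ell}[F(y) = 0] \le \Pr_{x \sim \mathcal{YES}^*_\ell}[F^*(x) = 1] + \Pr_{y \sim \mathcal{NO}^*_\ell}[F^*(y) = 0].$$

Proof. Given $F$, we hard-wire $x_{\ell+1,1}$ to be 1 and the rest of the variables in the first column and
the last row to be 0. Let $F^*$ denote the resulting monotone majority circuit obtained from $F$ of
the same size and depth. The definition of $\mathcal{YES}_\ell$ and $\mathcal{NO}_\ell$ from $\mathcal{YES}^*_\ell$ and $\mathcal{NO}^*_\ell$ implies that

$$\Pr_{x \sim \mathcal{YES}_\ell}[F(x) = 1] = \Pr_{x \sim \mathcal{YES}^*_\ell}[F^*(x) = 1] \quad\text{and}\quad \Pr_{y \sim \mathcal{NO}_\ell}[F(y) = 0] \le \Pr_{y \sim \mathcal{NO}^*_\ell}[F^*(y) = 0],$$

where the inequality follows from the monotonicity of $F$. The lemma then follows. □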

2.3 The Key Induction Lemma. 

Given distributions defined in Sections 2.1 and 2.2, we prove the following key technical lemma.
Recall that e = 2“^^*^. Below we let M = 2^'^"'. 

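Lemma 2.7. Let $\ell \in \{2, \ldots, d\}$. Suppose that any depth-$(\ell - 1)$ monotone majority circuit $F$ over
$\{0,1\}^{\ell \times N_{\ell-1}}$ of size at most $M$ satisfies

$$\Pr_{x \sim \mathcal{YES}_{\ell-1}}[F(x) = 1] + \Pr_{y \sim \mathcal{NO}_{\ell-1}}[F(y) = 0] \le 1 + 7^{\ell-2} \varepsilon \quad\text{and}$$

$$\Pr_{x \sim \mathcal{YES}'_{\ell-1}}[F(x) = 1] + \Pr_{y \sim \mathcal{NO}'_{\ell-1}}[F(y) = 0] \le 1 + 7^{\ell-2} \varepsilon. \qquad (9)$$

Then any depth-$\ell$ monotone majority circuit $F^*$ over $\{0,1\}^{\ell \times N^*_\ell}$ of size at most $M$ satisfies

$$\Pr_{x \sim \mathcal{YES}^*_\ell}[F^*(x) = 1] + \Pr_{y \sim \mathcal{NO}^*_\ell}[F^*(y) = 0] \le 1 + 7^{\ell-1} \varepsilon.$$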
Proof. Recall that strings drawn from $\mathcal{YES}^*_\ell$ and $\mathcal{NO}^*_\ell$ consist of $n$ sections. For convenience, we refer to strings in $\{0,1\}^{\ell\times N_{\ell-1}}$ as section strings.

We begin by defining some useful distributions $\mathcal{D}_1,\dots,\mathcal{D}_n$ over concatenations of section strings, where $\mathcal{D}_t$ is supported on concatenations of $t-1$ section strings. First, let $\mathcal{D}$ denote the following distribution over section strings: $x\sim\mathcal{D}$ is drawn from $\mathcal{NO}_{\ell-1}$ with probability 1/2 and from $\mathcal{YES}'_{\ell-1}$ with probability 1/2. For each $t\in[n]$, we use $\mathcal{D}_t$ to denote the distribution of the concatenation of $t-1$ section strings, each drawn from $\mathcal{D}$ independently. (So $\mathcal{D}_t$ is a distribution over $\{0,1\}^{\ell\times(t-1)N_{\ell-1}}$.) Note that in the special case when $t=1$, $\mathcal{D}_1$ is supported on the empty string only. Note also that for $t\in[n]$, $\mathcal{D}_t$ is generated precisely according to (a) or (a') from Section 2.2 (recall that (a) and (a') are the same).

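As in the statement of Lemma 2.7, let $F^*$ be a depth-$\ell$ monotone majority circuit on $\{0,1\}^{\ell\times N^*_\ell}$ of size at most $M$. We say a string $z\in\mathrm{supp}(\mathcal{D}_t)$ for some $t\in[n]$ is good with respect to $F^*$ if

$$\Pr_{x\sim\mathcal{YES}_{\ell-1}}[F^*(z\circ x\circ\mathbf{0})=1]+\Pr_{y\sim\mathcal{NO}'_{\ell-1}}[F^*(z\circ y\circ\mathbf{1})=0]>1+6\delta,$$

where we write $\mathbf{0}$ and $\mathbf{1}$ to denote the all-0 and all-1 strings in $\{0,1\}^{\ell\times(n-t)N_{\ell-1}}$ and $\delta=7^{\ell-2}\varepsilon$.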

Now we fix a $t\in[n]$ and fix a good string $z\in\mathrm{supp}(\mathcal{D}_t)$. Let $\rho_z$ be the restriction that fixes the first $t-1$ sections of variables of $F^*$ to be $z$ and leaves the remaining $n-(t-1)$ sections unfixed. As $z$ is good, we have that $F^*\upharpoonright\rho_z$ is nontrivial (i.e., $F^*\upharpoonright\rho_z\not\equiv 0$ or 1). We write $H_1,\dots,H_m$ (with multiplicities) to denote the set of all depth-$(\ell-1)$ sub-circuits rooted at children of the output gate of $F^*$ such that $H_i\upharpoonright\rho_z$ is nontrivial. In other words, the same sub-circuit may appear multiple times in this list if the output majority gate in $F^*$ contains multiple wires to it. Since the size of $F^*$ is at most $M$, the fan-in of the output majority gate of $F^*$ is at most $M$, and consequently $m\le M$. Since $F^*\upharpoonright\rho_z$ is nontrivial, there is a positive integer $h\in[M]$ such that $F^*\upharpoonright\rho_z$ outputs 1 if and only if at least $h$ of $H_1\upharpoonright\rho_z,\dots,H_m\upharpoonright\rho_z$ output 1. The following claim shows that with non-negligible probability, a random $x\sim\mathcal{D}$ is such that "many" $H_i$'s become trivial (i.e., compute a constant function) after a restriction by $\rho_{z\circ x}$.

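Claim 2.8. Suppose that $z$ is a good string in the support of $\mathcal{D}_t$. Then we have

$$\Pr_{x\sim\mathcal{D}}\Big[\,\big|\{i\in[m]:H_i\upharpoonright\rho_{z\circ x}\text{ is trivial}\}\big|\ge\delta^2m/2\,\Big]\ge\delta/4.$$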


Proof. We consider two cases: $h\ge m/2$ or $h<m/2$. We focus on the latter below; the former case is symmetric. Assume that $h<m/2$. Since $z$ is good, we have

$$\Pr_{y\sim\mathcal{NO}'_{\ell-1}}[F^*(z\circ y\circ\mathbf{1})=0]>1+6\delta-1=6\delta.$$

However, if $y\in\mathrm{supp}(\mathcal{NO}'_{\ell-1})$ satisfies $F^*(z\circ y\circ\mathbf{1})=0$, then by $h<m/2$ it must be the case that at least $m/2$ of the $H_i$'s have $H_i(z\circ y\circ\mathbf{1})=0$, and hence

$$\mathbf{E}_{y\sim\mathcal{NO}'_{\ell-1}}\big[\text{number of }H_i\text{'s with }H_i(z\circ y\circ\mathbf{1})=0\big]\ge3\delta m.\qquad(10)$$

Let $I$ denote the set of $i\in[m]$ such that

$$\Pr_{y\sim\mathcal{NO}'_{\ell-1}}[H_i(z\circ y\circ\mathbf{1})=0]\ge2\delta.\qquad(11)$$
Then we have from (10) that

$$|I|+(m-|I|)\cdot2\delta\ge3\delta m,$$

which implies that $|I|\ge\delta m$.

We write $\rho$ to denote the restriction over $\{0,1\}^{\ell\times N^*_\ell}$ that fixes the first $t-1$ sections of input variables to be $z$ and the last $n-t$ sections of input variables to be all 1, and leaves only the variables in the $t$-th section unfixed. So each $H_i\upharpoonright\rho$ is a depth-$(\ell-1)$ monotone majority circuit over $\{0,1\}^{\ell\times N_{\ell-1}}$ of size at most $M$. Then combining (11) and the assumption of the lemma, i.e., (9), applied to $H_i\upharpoonright\rho$, we have that each $i\in I$ satisfies

$$\Pr_{x\sim\mathcal{YES}'_{\ell-1}}[H_i(z\circ x\circ\mathbf{1})=1]\le1+\delta-2\delta=1-\delta,$$

and thus,

$$\Pr_{x\sim\mathcal{YES}'_{\ell-1}}[H_i(z\circ x\circ\mathbf{1})=0]\ge\delta.\qquad(12)$$

Note that if an $x\in\mathrm{supp}(\mathcal{YES}'_{\ell-1})$ satisfies $H_i(z\circ x\circ\mathbf{1})=0$, then we have $H_i\upharpoonright\rho_{z\circ x}\equiv0$ by the monotonicity of $H_i$. Let $X$ be a random variable that denotes the number of $H_i$'s that become trivial after $\rho_{z\circ x}$, where $x\sim\mathcal{YES}'_{\ell-1}$. So by (12) the expectation of $X$ is at least $\delta|I|$. Let $q$ denote the probability that $X\ge\delta|I|/2$. The lower bound $\mathbf{E}[X]\ge\delta|I|$ implies that

$$q\cdot|I|+(1-q)\cdot\delta|I|/2\ge\delta|I|,$$

and thus $q\ge\delta/2$. Plugging in $|I|\ge\delta m$, we have that $X\ge\delta^2m/2$ with probability at least $\delta/2$.

Finally, taking into account that a draw of $x\sim\mathcal{D}$ is drawn from $\mathcal{YES}'_{\ell-1}$ with probability 1/2, we see that with probability at least $\delta/4$ over a draw of $x\sim\mathcal{D}$, at least $\delta^2m/2$ many $H_i$'s become trivial after $\rho_{z\circ x}$. This finishes the proof of the claim. □
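The averaging step in the proof of Claim 2.8 can be checked mechanically. The following Python sketch (the helper name `min_hit_probability` is ours, introduced only for illustration) solves the constraint $q|I|+(1-q)\delta|I|/2\ge\delta|I|$ for the smallest admissible $q$ with exact rationals and confirms that it is at least $\delta/2$.

```python
from fractions import Fraction

def min_hit_probability(delta):
    """Smallest q with q*|I| + (1 - q)*delta*|I|/2 >= delta*|I|.

    The factor |I| cancels, leaving q*(1 - delta/2) >= delta/2.
    """
    delta = Fraction(delta)
    return (delta / 2) / (1 - delta / 2)

for delta in (Fraction(1, 2), Fraction(1, 128), Fraction(1, 2**20)):
    q = min_hit_probability(delta)
    assert q >= delta / 2        # the bound q >= delta/2 used in the proof
    assert q / 2 >= delta / 4    # after x lands in the relevant half of D
```

Exact fractions are used so that the comparison is not affected by rounding for very small $\delta$.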

Claim 2.8 implies that if $z\sim\mathcal{D}_t$ is good (with respect to $F^*$), then with probability at least $\delta/4$ over a random draw of $x\sim\mathcal{D}$, the restriction $\rho_{z\circ x}$ trivializes at least a $(\delta^2/2)$-fraction of the depth-$(\ell-1)$ sub-circuits of $F^*$ that are not trivialized by $\rho_z$. Intuitively, this is useful because it means that we have a good chance of getting a significant simplification of $F^*$ (shrinking the fan-in of the top gate by a lot), and since $F^*$ is of size at most $M$ this cannot happen too many times. On the other hand, note that if $z$ is not good, then by definition we have

$$\Pr_{x\sim\mathcal{YES}_{\ell-1}}[F^*(z\circ x\circ\mathbf{0})=1]+\Pr_{y\sim\mathcal{NO}'_{\ell-1}}[F^*(z\circ y\circ\mathbf{1})=0]\le1+6\delta,$$

which intuitively is also useful for our purpose of bounding

$$\Pr_{x\sim\mathcal{YES}^*_\ell}[F^*(x)=1]+\Pr_{y\sim\mathcal{NO}^*_\ell}[F^*(y)=0]\qquad(13)$$

from above by $1+7\delta$.

To finish the proof of the lemma, we take the following alternative but equivalent view of (13). Let $z_1,\dots,z_n$ be a sequence of random section strings, each drawn from $\mathcal{D}$ independently. By the definition of $\mathcal{YES}^*_\ell$ and $\mathcal{NO}^*_\ell$ (recall Figure 1), we have that

$$(13)\times n=\mathbf{E}_{z_1,\dots,z_n}\Bigg[\sum_{t=1}^{n}\Pr_{x\sim\mathcal{YES}_{\ell-1}}[F^*(z_1\circ\cdots\circ z_{t-1}\circ x\circ\mathbf{0})=1]+\sum_{t=1}^{n}\Pr_{y\sim\mathcal{NO}'_{\ell-1}}[F^*(z_1\circ\cdots\circ z_{t-1}\circ y\circ\mathbf{1})=0]\Bigg].$$

This can be viewed as the expectation of a random variable $\Gamma$ generated as follows.

1. Start with $\Gamma=0$.

2. For each "round" $t=1,\dots,n$, independently draw $z_t$ from $\mathcal{D}$ and add the following to $\Gamma$:

$$\Pr_{x\sim\mathcal{YES}_{\ell-1}}[F^*(z_1\circ\cdots\circ z_{t-1}\circ x\circ\mathbf{0})=1]+\Pr_{y\sim\mathcal{NO}'_{\ell-1}}[F^*(z_1\circ\cdots\circ z_{t-1}\circ y\circ\mathbf{1})=0].$$

So it suffices to show that $\mathbf{E}[\Gamma]\le(1+7\delta)n$.

For each of the $n$ rounds $t=1,\dots,n$, exactly one of the following two possibilities must hold:

1. The current string $z_1\circ\cdots\circ z_{t-1}$ is not good. In this case $\Gamma$ goes up by at most $1+6\delta$ in the $t$-th round. Otherwise,

2. The current string $z_1\circ\cdots\circ z_{t-1}$ is good. In this case $\Gamma$ can go up by at most 2 in the $t$-th round, but by our previous analysis (specifically, Claim 2.8), the number of nontrivial depth-$(\ell-1)$ subcircuits of $F^*$ (with multiplicities) rooted at children of the output gate of $F^*$ drops by a factor of $(1-\delta^2/2)$ with probability at least $\delta/4$ when the draw of $z_t$ in the $t$-th round extends the restriction to $\rho_{z_1\circ\cdots\circ z_t}$. Note that $F^*$ has size at most $M\le2^{\delta^5n}$, so it can survive at most $2\delta^3n$ many such drops before $F^*$ becomes trivial; to see this, observe that

$$(1-\delta^2/2)^{2\delta^3n}\le\exp\big(-(\delta^2/2)\cdot(2\delta^3n)\big)=\exp(-\delta^5n)<2^{-\delta^5n}.\qquad(14)$$

Note further that once $F^*$ becomes trivial, $\Gamma$ goes up by 1 in every subsequent round.
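As a numerical sanity check, the chain in (14) can be evaluated in log space. The sketch below reads (14) as $(1-\delta^2/2)^{2\delta^3n}\le\exp(-\delta^5n)<2^{-\delta^5n}$; that reading, the function name `survives_at_most`, and the sample values of $\delta$ and $n$ are assumptions made for illustration.

```python
import math

def survives_at_most(delta, n):
    """Check (1 - d^2/2)^(2 d^3 n) <= exp(-d^5 n) < 2^(-d^5 n) using natural logs."""
    hits = 2 * delta**3 * n
    log_lhs = hits * math.log1p(-delta**2 / 2)  # log of the left-hand side
    log_mid = -(delta**5) * n                   # log of exp(-d^5 n)
    log_rhs = log_mid * math.log(2)             # log of 2^(-d^5 n)
    return log_lhs <= log_mid < log_rhs

assert survives_at_most(0.01, 10**12)
assert survives_at_most(2**-7, 2**60)
```

Working with logarithms avoids underflow, since the quantities being compared are far below the smallest positive float for realistic parameters.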

We use $S$, a random variable, to denote the total number of rounds $t\in[n]$ such that the current string $z_1\circ\cdots\circ z_{t-1}$ is good (note that once $F^*$ becomes trivial the current string cannot be good). We claim that $S\le32\delta^2n$ with high probability.

Claim 2.9. We have $S\le32\delta^2n$ with probability at least $1-\exp(-n\delta^4/2)$.

Proof. We say that round $t$ is good if the current string $z_1\circ\cdots\circ z_{t-1}$ is good. We say that $F^*$ is hit in the $t$-th round if $z_1\circ\cdots\circ z_{t-1}$ is good and the number of depth-$(\ell-1)$ subcircuits of $F^*$ (with multiplicities) that are nontrivial under the restriction $\rho_{z_1\circ\cdots\circ z_{t-1}}$ drops by a factor of at least $(1-\delta^2/2)$ under the restriction $\rho_{z_1\circ\cdots\circ z_t}$. Then we can write $\Pr[S>32\delta^2n]$ as

$$\Pr\big[S>32\delta^2n\text{ and }F^*\text{ is hit}\ge2\delta^3n\text{ many times during the first }32\delta^2n\text{ of the good rounds}\big]$$

$$+\Pr\big[S>32\delta^2n\text{ and }F^*\text{ is hit}<2\delta^3n\text{ many times during the first }32\delta^2n\text{ of the good rounds}\big].$$

The first of these probabilities is zero because of (14), i.e., if $F^*$ is hit $2\delta^3n$ times then it is trivialized, so no subsequent rounds can be good and thus $F^*$ cannot be hit again.
We focus on upper bounding the second probability. For each $i$ from 1 to $32\delta^2n$ we define the following random variable $Y_i$:

$$Y_i=\begin{cases}1&\text{if }F^*\text{ is hit in the }i\text{-th good round or there are fewer than }i\text{ good rounds,}\\0&\text{otherwise (there are at least }i\text{ good rounds and }F^*\text{ is not hit in the }i\text{-th good round).}\end{cases}$$

The second probability we are interested in is at most $\Pr[\sum_iY_i<2\delta^3n]$. By Claim 2.8, we have

$$\mathbf{E}[Y_i\mid Y_1=b_1,\dots,Y_{i-1}=b_{i-1}]\ge\delta/4\qquad(15)$$

for all $i$ and all $b_1,\dots,b_{i-1}\in\{0,1\}$. Let $X_0=0$ and

$$X_i=X_{i-1}+Y_i-\mathbf{E}[Y_i\mid Y_1,\dots,Y_{i-1}].$$

Then $X_0,X_1,\dots$ is a martingale that satisfies $|X_i-X_{i-1}|\le1$ with probability 1, and we have that

$$X_{32\delta^2n}=\sum_{i=1}^{32\delta^2n}\big(Y_i-\mathbf{E}[Y_i\mid Y_1,\dots,Y_{i-1}]\big)\le\sum_{i=1}^{32\delta^2n}Y_i-8\delta^3n,$$

using (15) for the inequality. Applying the Azuma-Hoeffding inequality (see, e.g., Theorem 5.1 of [DP09]) to the martingale $X_0,X_1,\dots$, we get that

$$\Pr\Big[\sum_iY_i<2\delta^3n\Big]\le\Pr\big[X_{32\delta^2n}<2\delta^3n-8\delta^3n\big]\le\exp\left(-\frac{(6\delta^3n)^2}{2\cdot32\delta^2n}\right)\le\exp(-n\delta^4/2).$$


This finishes the proof of the claim. □ 
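The exponent in the Azuma-Hoeffding step can be verified with exact arithmetic. The sketch below assumes the setting described in the proof: a martingale with $32\delta^2n$ steps, increments bounded by 1, and a deviation of $8\delta^3n-2\delta^3n=6\delta^3n$ below zero. The function name `azuma_exponent` is ours.

```python
from fractions import Fraction

def azuma_exponent(delta, n):
    """Exponent t^2 / (2 m) in the tail bound exp(-t^2 / (2 m)) for 1-bounded increments."""
    delta = Fraction(delta)
    m = 32 * delta**2 * n                     # number of martingale steps
    t = 8 * delta**3 * n - 2 * delta**3 * n   # deviation, equal to 6 * delta^3 * n
    return t**2 / (2 * m)

delta, n = Fraction(1, 128), 2**40
assert azuma_exponent(delta, n) == Fraction(9, 16) * delta**4 * n
assert azuma_exponent(delta, n) >= delta**4 * n / 2
```

The exponent simplifies to $\tfrac{9}{16}\delta^4n$, which is where the bound $\exp(-n\delta^4/2)$ comes from.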

We are almost done with the proof of Lemma 2.7. Recalling that $\delta=7^{\ell-2}\varepsilon$, we have that

exp (—n(5^/2) < (5/4 and 6 < 2“® (16) 

since d >2, n > 2®°'^, e = and £ G {2,... , d}. It follows from Claim 2.9 that 

$$\mathbf{E}[\Gamma]\le\exp(-n\delta^4/2)\cdot2n+\big(1-\exp(-n\delta^4/2)\big)\cdot\big(2\cdot32\delta^2n+(1+6\delta)\cdot(n-32\delta^2n)\big)$$

$$\le\delta n/2+64\delta^2n+(1+6\delta)n\le(1+7\delta)n,$$

where we also used the two inequalities in (16). This finishes the proof of the lemma. □ 
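The last inequality of the chain, $\delta n/2+64\delta^2n+(1+6\delta)n\le(1+7\delta)n$, reduces after dividing by $n$ to $64\delta^2\le\delta/2$. The following sketch (the function name `final_bound_holds` is ours) checks this with exact rationals and shows that it holds exactly for $\delta\le1/128$.

```python
from fractions import Fraction

def final_bound_holds(delta):
    """Check delta/2 + 64*delta^2 + (1 + 6*delta) <= 1 + 7*delta (both sides divided by n)."""
    delta = Fraction(delta)
    return delta / 2 + 64 * delta**2 + (1 + 6 * delta) <= 1 + 7 * delta

assert final_bound_holds(Fraction(1, 128))
assert not final_bound_holds(Fraction(1, 127))
```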


2.4 Proof of Theorem 1. 

Finally we combine all the ingredients to prove Theorem 1. 

Recall that d > 2, n and N are positive integers that satisfy n > 2®^*^ and N > (2^^n)'^ > N^. 
We also have £ = 2“^^*^ and M = 2^ We first prove by induction on £ that, for £ = 1,... ,d, any 


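monotone majority circuit $F$ over $\{0,1\}^{(\ell+1)\times N_\ell}$ of depth $\ell$ and size at most $M$ satisfies

$$\Pr_{x\sim\mathcal{YES}_\ell}[F(x)=1]+\Pr_{y\sim\mathcal{NO}_\ell}[F(y)=0]\le1+7^{\ell-1}\varepsilon\quad\text{and}$$

$$\Pr_{x\sim\mathcal{YES}'_\ell}[F(x)=1]+\Pr_{y\sim\mathcal{NO}'_\ell}[F(y)=0]\le1+7^{\ell-1}\varepsilon.\qquad(17)$$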

The $\ell=1$ base case follows from Lemma 2.1. Now assume that (17) holds for $\ell-1$. By Lemma 2.7, any monotone majority circuit $F^*$ over $\{0,1\}^{\ell\times N^*_\ell}$ of depth $\ell$ and size at most $M$ satisfies

$$\Pr_{x\sim\mathcal{YES}^*_\ell}[F^*(x)=1]+\Pr_{y\sim\mathcal{NO}^*_\ell}[F^*(y)=0]\le1+7^{\ell-1}\varepsilon.\qquad(18)$$

It follows from Lemmas 2.5 and 2.6 that every monotone majority circuit $F$ over $\{0,1\}^{(\ell+1)\times N_\ell}$ of depth $\ell$ and size at most $M$ satisfies (17). This finishes the induction.

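We finish the proof using $(\mathcal{YES}^*_d,\mathcal{NO}^*_d)$ over $\{0,1\}^{d\times N^*_d}$, where $N^*_d\le N$. Given (18) on $(\mathcal{YES}^*_d,\mathcal{NO}^*_d)$ and the fact that $7^{d-1}\varepsilon<1$, no depth-$d$ monotone majority circuit on $\{0,1\}^{d\times N^*_d}$ of size at most $M$ can compute $U_{d,N^*_d}$ correctly on all inputs, because every string $x\sim\mathcal{YES}^*_d$ has $\mathrm{SUM}(x)=2^{N^*_d}$ and hence $U_{d,N^*_d}(x)=1$, while every string $y\sim\mathcal{NO}^*_d$ has $\mathrm{SUM}(y)=2^{N^*_d}-d$ and hence $U_{d,N^*_d}(y)=0$. Since $N\ge N^*_d$, this establishes Theorem 1.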

3 The Upper Bound: Proof of Theorem 2. 

We prove Theorem 2 in this section. We focus on the case when $n=N^{1/d}>1$ is a positive integer, and give a depth-$d$ monotone majority circuit that computes $U_{k,N}$ and has size at most

$$2^{3(N^{1/d}\cdot\log k+\log N)}.\qquad(19)$$

For the general case, we let $n=\lceil N^{1/d}\rceil>1$, and let $s$ denote the smallest integer such that $n^s\ge N$ (so $s\le d$). Then we first construct a depth-$s$ monotone majority circuit that computes $U_{k,n^s}$ and then hard-wire the variables in the last $n^s-N$ columns to be 0 to get a circuit for $U_{k,N}$. The size
bound given in the statement of Theorem 2 follows from (19) and the simple facts that n < 
and n® < nN < . For the rest of the section we assume that n = > 1 is an integer. 

First we note that the theorem (with the size bound as given in (19); the same below) is trivial if $N\le\log k$, since implementing $U_{k,N}$ directly using a single THR gate only takes a total weight of $k\cdot2^N\le k^2$. Assuming that $N>\log k$ below, we let $t\in\{1,\dots,d\}$ denote the smallest integer such that $n^t=N^{t/d}\ge\log k$. We also write $M=n^t$. It is clear by the choice of $t$ that we have

$$M<n\log k.\qquad(20)$$

With the same reasoning the theorem is trivial if $M=N$. Below we assume that $t\le d-1$.

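We need some notation for our construction. We say $\mathcal{S}=(S_1,\dots,S_\ell)$ is an $\ell$-decomposition of $[N]$ if there exist indices $1=a_1^-\le a_1^+<a_2^-\le a_2^+<\dots<a_\ell^-\le a_\ell^+=N$ such that

• For each $\gamma\in[\ell]$, $S_\gamma=\{a_\gamma^-,a_\gamma^-+1,\dots,a_\gamma^+\}$; and

• $S_1\cup\dots\cup S_\ell=[N]$.

In other words, $\mathcal{S}$ partitions $[N]$ into $\ell$ sequential intervals.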
Let $(x_{ij})_{i\in[k],j\in[N]}$ be the set of input variables of $U_{k,N}$. Given an $\ell$-decomposition $\mathcal{S}$, we define a sequence of "conditional" carry-bit functions $c^{(\gamma)}_{\alpha,\beta}$, where $0\le\alpha,\beta\le k-1$ and $\gamma\in[\ell]$. Each function $c^{(\gamma)}_{\alpha,\beta}$ depends only on the variables $x_{ij}$ with $j\in S_\gamma$. For convenience, let $B_\gamma=[k]\times S_\gamma$ be the set containing the indices of these variables. Intuitively, for an assignment $x\in\{0,1\}^{k\times N}$, we have $c^{(\gamma)}_{\alpha,\beta}(x)=1$ if and only if a carry of value at least $\alpha$ is generated/propagated by the input bits corresponding to $B_\gamma$, assuming this block of variables receives a carry of value $\beta$ from the block of variables to the right. Formally,

$$c^{(\gamma)}_{\alpha,\beta}(x)=1\iff\sum_{(i,j)\in B_\gamma}2^{a_\gamma^+-j}\cdot x_{ij}+\beta\ge\alpha\cdot2^{|S_\gamma|}.\qquad(21)$$

For each $i\in\{0,\dots,d-t\}$, we write $\mathcal{S}^{(i)}$ to denote the $n^i$-decomposition in which each set has size $n^{d-i}$. Observe that, for the 1-decomposition $\mathcal{S}^{(0)}=(S^0)$, where $S^0=\{1,\dots,N\}$, we have

$$U_{k,N}(x)=1\iff c^{(1)}_{1,0}(x)=1\qquad(22)$$

for the function $c^{(1)}_{1,0}$ of $\mathcal{S}^{(0)}$.
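To make the definition concrete, here is a small Python sketch of the conditional carry-bit functions, together with a brute-force check of the observation that the single block $[N]$ with $\alpha=1$, $\beta=0$ recovers $U_{k,N}$. The names `threshold_u` and `carry_bit`, the 0-based column indexing (column 0 most significant), and the representation of $x$ as a tuple of bit rows are assumptions of this sketch.

```python
from itertools import product

def threshold_u(x, N):
    """U_{k,N}: 1 iff the rows of x, read as N-bit numbers, sum to at least 2^N."""
    total = sum(int("".join(map(str, row)), 2) for row in x)
    return int(total >= 2**N)

def carry_bit(x, block, alpha, beta):
    """Conditional carry bit for a block of consecutive 0-based column indices.

    Returns 1 iff the block's columns, summed over all rows, plus an incoming
    carry beta, produce an outgoing carry of value at least alpha.
    """
    width = len(block)
    block_sum = sum(row[j] << (width - 1 - pos)
                    for row in x for pos, j in enumerate(block))
    return int(block_sum + beta >= alpha * 2**width)

# With a single block covering every column, (alpha, beta) = (1, 0) gives U_{k,N}.
for bits in product((0, 1), repeat=6):
    x = (bits[:3], bits[3:])          # k = 2 rows, N = 3 columns
    assert carry_bit(x, range(3), 1, 0) == threshold_u(x, 3)
```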

Our construction is based on a recursive computation of functions $c^{(\gamma)}_{\alpha,\beta}$ associated to different decompositions $\mathcal{S}^{(r)}$, for $r$ from $d-t$ back to 0, where each decomposition $\mathcal{S}^{(r+1)}$ is obtained via a refinement of the previous decomposition $\mathcal{S}^{(r)}$. More precisely, we construct our monotone majority circuit for $U_{k,N}$ with the following intended behavior. The top gate of the circuit computes the bit $c^{(1)}_{1,0}(x)$ associated to the decomposition $\mathcal{S}^{(0)}$. However, this gate does not have access to $x$: it receives as input the output of carry-bit functions $c^{(\gamma)}_{\alpha,\beta}(x)$ corresponding to the finer $n$-decomposition $\mathcal{S}^{(1)}$ in which each block has $n^{d-1}$ columns. This then leads to a recursive procedure, which unfolds as a depth-$(d-t+1)$ circuit described in more detail below (recall that $t\ge1$).

In general our circuit has $d-t+1$ layers of majority gates, where gates at the $r$-th layer compute carry-bit functions corresponding to the $n^{d-t-r+1}$-decomposition $\mathcal{S}^{(d-t-r+1)}$. The base case, i.e., the first layer of majority gates that are supposed to compute $c^{(\gamma)}_{\alpha,\beta}$ of $\mathcal{S}^{(d-t)}$, is done by a majority gate that follows directly the definition given in (21). It is clear that the size of each gate in the first layer is bounded from above by $k2^M$.

Due to the recursive nature of our construction, it is sufficient to describe how to compute the carry-bit functions corresponding to a decomposition $\mathcal{S}^{(r)}$ from the carry-bit functions corresponding to $\mathcal{S}^{(r+1)}$, for each $r\in\{0,1,\dots,d-t-1\}$. For convenience we fix an $r$ below and write $\mathcal{S}'$ for $\mathcal{S}^{(r)}$ and $\mathcal{S}$ for $\mathcal{S}^{(r+1)}$. We also fix a set $S'\in\mathcal{S}'$ with $S'=S_1\cup\dots\cup S_n$, where $S_1,\dots,S_n$ are sets in the ordered tuple $\mathcal{S}$ listed from left to right. We write $C_{u,v}$ to denote a carry-bit function of the block $S'$ that we need to compute, for some $u,v\in\{0,\dots,k-1\}$, and assume that we have already computed $c^{(\gamma)}_{\alpha,\beta}$ for each block $S_\gamma$, $\gamma\in[n]$, and for all $\alpha,\beta\in\{0,\dots,k-1\}$. The goal is to compute $C_{u,v}(x)$ given the bits $c^{(\gamma)}_{\alpha,\beta}(x)$.

We start with a general observation about carry-bit functions of a block. We say that $(\alpha,\beta)\prec(\alpha',\beta')$ if either $\alpha<\alpha'$, or $\alpha=\alpha'$ and $\beta>\beta'$. Given a block $S_\gamma$, note that $c^{(\gamma)}_{\alpha,\beta}$ has the following monotonicity property. (Note that the assumption of $|S_\gamma|\ge\log k$ always holds given our choice of $t$ and the trivial cases ruled out at the beginning of the section.)

Claim 3.1. Assume that $|S_\gamma|\ge\log k$. If $(\alpha,\beta)\prec(\alpha',\beta')$, then $c^{(\gamma)}_{\alpha,\beta}(x)\ge c^{(\gamma)}_{\alpha',\beta'}(x)$ on every input string $x$ for $U_{k,N}$.

Proof. We consider the two cases corresponding to the assumption that $(\alpha,\beta)\prec(\alpha',\beta')$. If $\alpha=\alpha'$ and $\beta>\beta'$, the claim follows immediately from (21).

Assume now that $\alpha<\alpha'$, where $\beta,\beta'\in\{0,\dots,k-1\}$ are arbitrary. Clearly it suffices to argue that $c^{(\gamma)}_{\alpha',k-1}(x)=1$ implies that $c^{(\gamma)}_{\alpha'-1,0}(x)=1$. Using (21), this assumption is equivalent to

$$\sum_{(i,j)\in B_\gamma}2^{a_\gamma^+-j}\cdot x_{ij}+(k-1)\ge\alpha'\cdot2^{|S_\gamma|}.\qquad(23)$$

In order to show $c^{(\gamma)}_{\alpha'-1,0}(x)=1$, we need to verify that

$$\sum_{(i,j)\in B_\gamma}2^{a_\gamma^+-j}\cdot x_{ij}\ge(\alpha'-1)\cdot2^{|S_\gamma|}.$$

Using (23) it is sufficient to have $k-1\le2^{|S_\gamma|}$. This follows from the assumption in the statement of the claim, which completes the proof. □

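The description of the majority gate that computes $C_{u,v}(x)$ for the block $S'$ in $\mathcal{S}'$ using $c^{(\gamma)}_{\alpha,\beta}(x)$ for blocks $S_1,\dots,S_n$ in $\mathcal{S}$ is based on the following lemma.

Lemma 3.2. Assume that $|S_\gamma|\ge\log k$ for every $\gamma\in[n]$. Then $C_{u,v}(x)=1$ if and only if

$$v+\sum_{\gamma=1}^{n}k^{n-\gamma}\sum_{\alpha=1}^{k-1}\sum_{\beta=0}^{k-1}c^{(\gamma)}_{\alpha,\beta}(x)\ge u\cdot k^n.\qquad(24)$$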


Proof. We consider (24) as a sum in base $k$ over $k(k-1)$ rows and $n$ columns of variables, with $v$ extra 1's on column $n$ (which corresponds to the least significant position). Let $p_\gamma$ denote the (base $k$) carry from column $\gamma$ to column $\gamma-1$ in (24), and let $q_\gamma$ denote the (base 2) carry from block $\gamma$ to block $\gamma-1$ in our decomposition of $x$ after adding $v$ to block $n$ (without taking into account the remaining columns of $x$ not covered by $S_1\cup\dots\cup S_n$).

We prove by induction that $p_\gamma=q_\gamma$ for all $\gamma$ from $n$ to 1. Notice that this establishes the lemma. For the basis when $\gamma=n$, we consider the following two cases:

1. If $c^{(n)}_{\alpha,\beta}=0$ for all $\alpha\ge1$ and $\beta\ge0$, then $q_n=0$ (as we have $c^{(n)}_{1,k-1}=0$ and $v\le k-1$). This implies that $p_n=q_n=0$.

2. Otherwise, let $(\alpha_n,\beta_n)$ denote the largest pair (under $\prec$ defined earlier) with $c^{(n)}_{\alpha_n,\beta_n}=1$. It follows from Claim 3.1 that $q_n=\alpha_n$ if $\beta_n\le v$, and $q_n=\alpha_n-1$ if $\beta_n>v$. We also have

$$v+\sum_{\alpha=1,\beta=0}^{k-1}c^{(n)}_{\alpha,\beta}=(\alpha_n-1)\cdot k+(k-\beta_n+v).$$

It follows from this equation and the characterization of $q_n$ that the (base $k$) carry $p_n=q_n$.

The induction step is similar. We assume that $p_{\gamma+1}=q_{\gamma+1}$, and prove that $p_\gamma=q_\gamma$. We focus on the $\gamma$-th column from (24) and block $\gamma$, and consider the following two cases:

1. If $c^{(\gamma)}_{\alpha,\beta}=0$ for all $\alpha\ge1$ and $\beta\ge0$, then $q_\gamma=0$ (as we have $c^{(\gamma)}_{1,k-1}=0$ and $q_{\gamma+1}\le k-1$). This implies that $p_\gamma=q_\gamma=0$.

2. Otherwise, let $(\alpha_\gamma,\beta_\gamma)$ denote the largest pair with $c^{(\gamma)}_{\alpha_\gamma,\beta_\gamma}=1$. Using Claim 3.1, $q_\gamma$ is $\alpha_\gamma$ if $\beta_\gamma\le q_{\gamma+1}$, and $q_\gamma$ is $\alpha_\gamma-1$ if $\beta_\gamma>q_{\gamma+1}$. Using the inductive hypothesis, we have

$$p_{\gamma+1}+\sum_{\alpha=1,\beta=0}^{k-1}c^{(\gamma)}_{\alpha,\beta}=(\alpha_\gamma-1)\cdot k+(k-\beta_\gamma+q_{\gamma+1}).$$

It follows from this equation and the characterization of $q_\gamma$ that $p_\gamma=q_\gamma$.

This finishes the induction, and the proof of the lemma. □
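Lemma 3.2 can be sanity-checked by brute force on tiny instances. The sketch below reads the threshold as $v+\sum_{\gamma}k^{n-\gamma}\sum_{\alpha\ge1,\beta\ge0}c^{(\gamma)}_{\alpha,\beta}(x)\ge u\cdot k^n$ and compares it with the direct definition of $C_{u,v}$ on the union of the blocks. The function names, the equal block widths, and the 0-based column indexing are assumptions of this sketch, not part of the construction.

```python
from itertools import product

def carry_bit(x, block, alpha, beta):
    """1 iff the block's columns plus incoming carry beta yield a carry >= alpha."""
    width = len(block)
    block_sum = sum(row[j] << (width - 1 - pos)
                    for row in x for pos, j in enumerate(block))
    return int(block_sum + beta >= alpha * 2**width)

def big_carry_direct(x, blocks, u, v):
    """C_{u,v} evaluated from the definition on the union of all blocks."""
    union = [j for block in blocks for j in block]
    return carry_bit(x, union, u, v)

def big_carry_from_blocks(x, blocks, u, v, k):
    """C_{u,v} evaluated as a single threshold over the small carry bits."""
    n = len(blocks)
    total = v
    for gamma, block in enumerate(blocks, start=1):
        ones = sum(carry_bit(x, block, a, b)
                   for a in range(1, k) for b in range(k))
        total += k ** (n - gamma) * ones
    return int(total >= u * k**n)

def lemma_holds(k, width, n):
    """Exhaustively compare both evaluations; requires 2**width >= k."""
    blocks = [list(range(g * width, (g + 1) * width)) for g in range(n)]
    cols = width * n
    for bits in product((0, 1), repeat=k * cols):
        x = [bits[i * cols:(i + 1) * cols] for i in range(k)]
        for u, v in product(range(k), repeat=2):
            if big_carry_direct(x, blocks, u, v) != big_carry_from_blocks(x, blocks, u, v, k):
                return False
    return True

assert lemma_holds(k=2, width=1, n=2)
```

The exhaustive loop is exponential in the number of input bits, so it is only meant for very small parameters.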

Lemma 3.2, (21), and our previous discussions complete the description of the circuit for $U_{k,N}$. Moreover, its correctness follows easily from (22) and Lemma 3.2. It remains to analyze the size of the resulting depth-$(d-t+1)$ majority circuit.

We upper bound its size layer by layer as follows. As discussed earlier, the size of each majority 
gate in the first layer is at most k2^, and there are many of them. Furthermore, for the i-th 
layer of the circuit, where i > 1, there are gates each of which has size at most 

as given in Lemma 3.2. Using (20), the majority circuit for has overall size at most 


1 . 

2 . 


d-t+l 

^d-t ^ ^ _ j^n+l 

i=2 


n 


< Nk2^ + 2Nk^+^ < 23(^'^Uogfc+iogiV)^ 


The construction presented here uses THR gates and majority circuits. We sketch in Appendix 
A an alternative construction with respect to semi-unbounded fan-in AND/OR circuits. 


4 Strengthening the Ajtai-Gurevich Result: Proof of Theorem 3. 

We require the following lemma: 

Lemma 4.1. For a suitable absolute constant $0<c<1$, letting $k=(\log N)^c$, the function $U_{k,N}$ is computed by a poly($N$)-size AND/OR/NOT circuit of depth 3.

Proof. Recall the well-known technique of carry-save addition, also known as the "3-to-2 trick," for addition of binary numbers (see, e.g., Section 1.2.3 of [Lei92]). This "trick" states that there is a (multi-output) circuit that takes as input three $n$-bit binary numbers $X,Y,Z$ and outputs two $(n+1)$-bit binary numbers $A,B$ such that (i) $A+B=X+Y+Z$, and (ii) each output bit $A_i$ or $B_i$ depends on at most 3 of the input bits. By applying this trick in parallel to the $N$-bit integers $x^{(1)},\dots,x^{(k)}$ that are the rows of the input to $U_{k,N}$, we obtain $\lceil2k/3\rceil$ many $(N+1)$-bit integers whose sum equals $x^{(1)}+\dots+x^{(k)}$. Recursing $O(\log k)$ times, we see that there are two $(N+O(\log k))$-bit integers (call them $y$ and $z$) such that $y+z=x^{(1)}+\dots+x^{(k)}$. A naive composition of these "3-to-2 trick" circuits in a tree of depth $O(\log k)$ to compute $y,z$ would yield a circuit of depth $\Theta(\log\log N)$. To avoid this blowup in circuit depth, we proceed differently, by observing that each bit $y_i,z_i$ depends on at most $k^{O(1)}\le\log N$ of the original input bits of the $x^{(j)}$'s, and exploiting this locality to get a depth-3 circuit overall.
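The 3-to-2 trick is easy to state on machine integers. The sketch below (the function names `carry_save` and `reduce_to_two` are ours) implements one carry-save step with bitwise operations and repeats it on disjoint triples until at most two numbers remain; it illustrates the arithmetic identity only, not the depth-3 circuit itself.

```python
def carry_save(x, y, z):
    """3-to-2 trick: return (a, b) with a + b == x + y + z.

    Bit i of a is the XOR of the three input bits at position i, and
    bit i+1 of b is their majority, so every output bit reads 3 input bits.
    """
    a = x ^ y ^ z
    b = ((x & y) | (x & z) | (y & z)) << 1
    return a, b

def reduce_to_two(numbers):
    """Apply the trick to disjoint triples until at most two numbers remain."""
    numbers = list(numbers)
    while len(numbers) > 2:
        nxt = []
        for i in range(0, len(numbers) - 2, 3):
            nxt.extend(carry_save(*numbers[i:i + 3]))
        # Carry over the zero, one or two numbers that did not fit into a triple.
        nxt.extend(numbers[len(numbers) - len(numbers) % 3:])
        numbers = nxt
    return numbers

rows = [3, 9, 14, 7, 1, 12, 5]
assert sum(reduce_to_two(rows)) == sum(rows)
```

Each pass shrinks the list by roughly a factor of 2/3, which matches the $O(\log k)$ rounds of recursion described in the proof.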

In more detail, let $y_i$ denote the bit in the "$2^i$-position" of the binary representation of $y$, so

$$y=\sum_{i=0}^{N+O(\log k)}y_i\cdot2^i\quad\text{and similarly}\quad z=\sum_{i=0}^{N+O(\log k)}z_i\cdot2^i.$$

We define "generate" and "propagate" bits for each bit position of $y+z$ the standard way,

$$g_i\stackrel{\mathrm{def}}{=}y_i\wedge z_i\quad\text{and}\quad p_i\stackrel{\mathrm{def}}{=}y_i\vee z_i,$$

so $g_i=1$ iff the bits in the $2^i$-position generate a carry into the $2^{i+1}$-position, and $p_i=1$ iff the bits in the $2^i$-position propagate an incoming carry into the $2^i$-position onward to the $2^{i+1}$-position. Observe that each $p_i,g_i$ depends on at most $2\log N$ of the original input bits.

The sum $y+z$ is at least $2^N$ if and only if either of the following events holds:

• Event A: at least one of the bits $y_N,y_{N+1},\dots,z_N,z_{N+1},\dots$ is 1. This can be expressed as

$$A=\bigvee_{j\ge N}(y_j\vee z_j).$$

Since $y_j,z_j$ each depend on at most $\log N$ of the original input variables, each of them can be expressed as a poly($N$)-size DNF over the original input variables, and thus $A$ can be expressed as a poly($N$)-size DNF.

• Event B: a carry bit is propagated into the $2^N$-position. Event B can be expressed as

$$B=\bigvee_{j<N}\Big(g_j\wedge\Big(\bigwedge_{j<i<N}p_i\Big)\Big).$$

As each $p_i$ depends on at most $2\log N$ of the original input variables, it can be expressed as a poly($N$)-size CNF; the same holds for $g_j$, so

$$\Big(g_j\wedge\Big(\bigwedge_{j<i<N}p_i\Big)\Big)$$

can be expressed as a poly($N$)-size CNF, and thus Event B can be expressed as a poly($N$)-size depth-3 OR-AND-OR circuit.

As a consequence, $A\vee B$ can be expressed as a poly($N$)-size depth-3 OR-AND-OR circuit over the original input variables, and the lemma is proved. □
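The characterization of $y+z\ge2^N$ by the two events can be checked exhaustively for small widths. In the sketch below, `sum_reaches` evaluates the OR-of-ANDs formula built from the generate and propagate bits; the function names and the explicit `width` parameter (the number of bit positions of $y$ and $z$) are assumptions of this sketch.

```python
def bit(v, i):
    """Bit of v in the 2^i-position."""
    return (v >> i) & 1

def sum_reaches(y, z, N, width):
    """Evaluate 'Event A or Event B' for the predicate y + z >= 2^N."""
    # Event A: some bit of y or z at position N or higher is set.
    event_a = any(bit(y, j) or bit(z, j) for j in range(N, width))
    g = [bit(y, i) & bit(z, i) for i in range(N)]  # generate bits
    p = [bit(y, i) | bit(z, i) for i in range(N)]  # propagate bits
    # Event B: a carry generated at position j is propagated up to position N.
    event_b = any(g[j] and all(p[i] for i in range(j + 1, N)) for j in range(N))
    return bool(event_a or event_b)

N, width = 3, 5
for y in range(2**width):
    for z in range(2**width):
        assert sum_reaches(y, z, N, width) == (y + z >= 2**N)
```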

def 

Proof of Theorem 3: We take g^ = Uk^N, where k = {log NY as in Lemma 4.1. Then Part (i) of 
the theorem follows from Lemma 4.1. Part (ii) follows from our main lower bound. Theorem 1, by 
observing that any circuit for $U_{k,N}$ yields a circuit for $U_{d,N}$ (by setting the last $k-d$ rows of the input to 0). □

A Upper Bound for the Universal Monotone Threshold Gate.

We sketch in this section a construction of monotone circuits for the universal monotone threshold 
function that matches the parameters obtained by Beimel and Weinreb [BW05]. More precisely, 
we describe a polynomial size 0(log iV)-depth AND/OR circuit for Up( n),o{n logN)^ where OR gates 
have unbounded fan-in, while AND gates have fan-in two. 

Our construction relies on a more general reduction from $U_{k,N}$ to a certain graph connectivity problem. We start with an $\ell$-decomposition $\mathcal{S}$ of $[N]$ (see Section 3 for more details), and assume (for now) that we are given the corresponding (conditional) carry-bit functions $c^{(\gamma)}_{\alpha,\beta}(x)$, where $\alpha$ and $\beta$ are in $\{0,\dots,k-1\}$, and $\gamma\in[\ell]$.

Given these bits, we can view them as a layered directed graph $G_{\mathcal{S},x}=(V,E)$ which depends on $x$ and $\mathcal{S}$ as follows. The vertices of $G$ are partitioned into $\ell+1$ layers, which we number for convenience from $\ell$ to 0. The first and last layers are special, and contain a single vertex only. The remaining layers each contain $k$ vertices. The (directed) edges of this graph leave the $\gamma$-th layer and reach the $(\gamma-1)$-th layer. We use the output bit of each function $c^{(\gamma)}_{\alpha,\beta}$ to decide whether an edge is present in this graph. The idea is that there will be a path from the $\ell$-th layer to the 0-th layer if and only if $U_{k,N}(x)=1$.

More precisely, we view $V=L_\ell\cup L_{\ell-1}\cup\dots\cup L_0$, where $L_\ell=\{s\}$, $L_0=\{t\}$, and $L_\gamma=\{v_{\gamma,0},\dots,v_{\gamma,k-1}\}$ for $\ell>\gamma>0$. The edge set $E\subseteq V\times V$ is defined as follows.

• {s, U£_ij) G E if and only if c) 1 = 1, where j G {0,..., A: — 1}; 

• (uij, t) G E if and only if c^-^q = 1, where j G {0,..., A: — 1}; 

• For ^ — 1 > 7 > 2 and 0 < a, (3 < k — 1, {vj^a,v-y-i,/3) G F if and only if = 1; 

• There is no other edge in E. 

Given vertices $u,v$ in a graph $G$, we write $u\leadsto v$ if there exists a directed path from $u$ to $v$ in $G$. Our construction is based on the following observation.

Lemma A.1. Given an $\ell$-decomposition $\mathcal{S}$ for $U_{k,N}$ and an input $x$,

$$U_{k,N}(x)=1\iff s\leadsto t\text{ in }G_{\mathcal{S},x}.$$
Proof. We provide a sketch of the argument. If $U_{k,N}(x)=1$, consider the sequence of carries generated during the actual computation of the standard binary addition algorithm.

At least one final carry is generated in this process, since the sum is at least $2^N$. The correct carry values computed during intermediate steps of the addition algorithm correspond to a path from $s$ to $t$ in $G_{\mathcal{S},x}$. On the other hand, if there exists a path from $s$ to $t$ in this graph, then an inductive argument starting from $t$ and proceeding backwards to $s$ shows that, during each step of the addition algorithm, at least some number of carries must be produced when we add the integers $x^{(1)},\dots,x^{(k)}$. In particular, there must be at least one final carry bit, which implies that $U_{k,N}(x)=1$. □
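The reduction to layered reachability can be illustrated on small inputs. The sketch below fixes one concrete edge indexing, which is an assumption of this sketch: vertex $(g,j)$ stands for "a carry of at least $j$ enters block $g$ from the right", $s$ is joined to $(1,j)$ when block 1 with incoming carry $j$ produces a final carry, and $(L-1,j)$ is joined to $t$ when the rightmost block $L$ alone produces a carry of at least $j$. All function names are ours, and at least two blocks are assumed.

```python
from itertools import product

def threshold_u(x, N):
    """U_{k,N}: 1 iff the rows of x, read as N-bit numbers, sum to at least 2^N."""
    return int(sum(int("".join(map(str, row)), 2) for row in x) >= 2**N)

def carry_bit(x, block, alpha, beta):
    """1 iff the block's columns plus incoming carry beta yield a carry >= alpha."""
    width = len(block)
    block_sum = sum(row[j] << (width - 1 - pos)
                    for row in x for pos, j in enumerate(block))
    return int(block_sum + beta >= alpha * 2**width)

def connected(x, blocks, k):
    """s-t reachability in the layered carry graph, explored layer by layer from s."""
    first, last = blocks[0], blocks[-1]
    # Carry values j for which the vertex next to s is reachable.
    frontier = {j for j in range(k) if carry_bit(x, first, 1, j)}
    for block in blocks[1:-1]:
        # Edge (a -> b): this block turns an incoming carry b into a carry >= a.
        frontier = {b for a in frontier for b in range(k)
                    if carry_bit(x, block, a, b)}
    return any(carry_bit(x, last, j, 0) for j in frontier)

k, N, blocks = 2, 4, [[0], [1, 2], [3]]
for bits in product((0, 1), repeat=k * N):
    x = [bits[i * N:(i + 1) * N] for i in range(k)]
    assert connected(x, blocks, k) == bool(threshold_u(x, N))
```

The layer-by-layer sweep is the simplest way to decide reachability here; a divide-and-conquer evaluation over the layers would be the shape used to obtain small circuit depth.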

To sum up, in order to compute $U_{k,N}$ from the carry-bit functions it is enough to solve a directed $s$-$t$-connectivity problem on a graph with $O(N)$ layers, where each layer contains $O(k)$ vertices.

The computation of the carry-bit functions can be done efficiently in the case of the universal 
monotone threshold function if we start with an n(Ailog Al)-decomposition. More precisely, each 
such function can be written as a monotone majority gate over a polynomial number of input bits, 
which is known to admit efficient monotone circuits as needed in our construction. 

Finally, the upper bound follows from the well-known construction of monotone circuits for $s$-$t$-connectivity on layered graphs via divide-and-conquer.